LNAI 3229 



Jose Julio Alferes 
Joao Leite (Eds.) 



Logics in 

Artificial Intelligence 

9th European Conference, JELIA 2004 
Lisbon, Portugal, September 2004 
Proceedings 




JELIA’04 



Springer 



Lecture Notes in Artificial Intelligence 3229 

Edited by J. G. Carbonell and J. Siekmann 
Subseries of Lecture Notes in Computer Science 




Jose Julio Alferes Joao Leite (Eds.) 



Logics in 

Artificial Intelligence 



9th European Conference, JELIA 2004 
Lisbon, Portugal, September 27-30, 2004 
Proceedings 



Springer 




Series Editors 

Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA 
Jörg Siekmann, University of Saarland, Saarbrücken, Germany 

Volume Editors 

Jose Julio Alferes 
Joao Leite 

Universidade Nova de Lisboa 

Faculdade de Ciencias e Tecnologia, Departamento de Informatica 
2829-516 Caparica, Portugal 
E-mail: {jja|jleite}@di.fct.unl.pt 



Library of Congress Control Number: 2004112842 



CR Subject Classification (1998): I.2, F.4.1, D.1.6 
ISSN 0302-9743 

ISBN 3-540-23242-7 Springer Berlin Heidelberg New York 



This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, 
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, 
in its current version, and permission for use must always be obtained from Springer. Violations are liable 
to prosecution under the German Copyright Law. 

Springer is a part of Springer Science+Business Media 

springeronline.com 

© Springer-Verlag Berlin Heidelberg 2004 
Printed in Germany 

Typesetting: Camera-ready by author, data conversion by PTP-Berlin, Protago-TeX-Production GmbH 
Printed on acid-free paper SPIN: 11320197 06/3142 5 4 3 2 1 0 




Preface 



Logics have, for many years, laid claim to providing a formal basis for the study 
and development of applications and systems in artificial intelligence. With the 
depth and maturity of formalisms, methodologies and logic-based systems today, 
this claim is stronger than ever. The European Conference on Logics in 
Artificial Intelligence (or Journees Europeennes sur la Logique en Intelligence 
Artificielle, JELIA) began back in 1988, as a workshop, in response to the need 
for a European forum for the discussion of emerging work in this field. Since 
then, JELIA has been organized biennially, with English as its official language, 
previous meetings taking place in Roscoff, France (1988), Amsterdam, Netherlands 
(1990), Berlin, Germany (1992), York, UK (1994), Evora, Portugal (1996), 
Dagstuhl, Germany (1998), Malaga, Spain (2000) and Cosenza, Italy (2002). The 
increasing interest in this forum, its international level with growing participation 
from researchers outside Europe, and the overall technical quality have turned 
JELIA into a major biennial forum for the discussion of logic-based approaches 
to artificial intelligence. 

The 9th European Conference on Logics in AI, JELIA 2004, took place in 
Lisbon, Portugal, between the 27th and the 30th of September 2004, and was 
hosted by the Universidade Nova de Lisboa. Its technical program comprised 3 
invited talks, by Francesca Rossi, Franz Baader, and Bernhard Nebel, and the 
presentation of 52 refereed technical articles selected by the Program Committee 
from among the 144 that were submitted, a number which in our opinion clearly 
indicates that the research area of logics in AI is one of great and increasing 
interest. 

It is our stance that the use of logics in AI will be further advanced if 
implemented logic-based systems receive appropriate exposure, implementation and 
testing methodologies are discussed by the community, and the performance and 
scope of applicability of these systems are presented and compared. To further 
promote this research, the JELIA 2004 technical programme also included a 
special session devoted to presentations and demonstrations of 15 implementations, 
selected from among 25 submissions, for solving or addressing problems of importance 
to areas in the scope of JELIA. 

We would like to thank the authors of all the 169 contributions that were 
submitted to JELIA 2004, the members of the Program Committee and the 
additional experts who helped in the reviewing process, for contributing to and 
ensuring the high scientific quality of JELIA 2004. 



September 2004 



Jose Julio Alferes 
Joao Leite 




Conference Organization 



The conference was organized by the Departamento de Informatica, Faculdade 
de Ciencias e Tecnologia, Universidade Nova de Lisboa, under the auspices of 
the Portuguese Association for AI, APPIA. 

Conference Chair 

Joao Leite Universidade Nova de Lisboa, Portugal 

Program Chair 

Jose Julio Alferes Universidade Nova de Lisboa, Portugal 



Program Committee 



Jose Julio Alferes        Universidade Nova de Lisboa, Portugal 
Franz Baader              TU Dresden, Germany 
Chitta Baral              Arizona State University, USA 
Salem Benferhat           Universite d’Artois, France 
Alexander Bochman         Holon Academic Institute of Technology, Israel 
Rafael Bordini            University of Durham, UK 
Gerhard Brewka            University of Leipzig, Germany 
Walter Carnielli          Universidade Estadual de Campinas, Brazil 
Luis Farinas del Cerro    Universite Paul Sabatier, France 
Mehdi Dastani             Universiteit Utrecht, The Netherlands 
James Delgrande           Simon Fraser University, Canada 
Jürgen Dix                TU Clausthal, Germany 
Roy Dyckhoff              University of St Andrews, UK 
Thomas Eiter              TU Wien, Austria 
Patrice Enjalbert         Universite de Caen, France 
Michael Fisher            University of Liverpool, UK 
Ulrich Furbach            University Koblenz-Landau, Germany 
Michael Gelfond           Texas Tech University, USA 
Sergio Greco              Universita della Calabria, Italy 
James Harland             Royal Melbourne Institute of Technology, Australia 
Joao Leite                Universidade Nova de Lisboa, Portugal 
Maurizio Lenzerini        Universita di Roma “La Sapienza”, Italy 
Nicola Leone              Universita della Calabria, Italy 
Vladimir Lifschitz        University of Texas at Austin, USA 
Maarten Marx              Universiteit van Amsterdam, The Netherlands 
John-Jules Meyer          Universiteit Utrecht, The Netherlands 
Bernhard Nebel            Universitat Freiburg, Germany 
Ilkka Niemela             Helsinki University of Technology, Finland 
Manuel Ojeda-Aciego       Universidad de Malaga, Spain 
David Pearce              Universidad Rey Juan Carlos, Spain 
Luis Moniz Pereira        Universidade Nova de Lisboa, Portugal 
Henri Prade               Universite Paul Sabatier, France 
Henry Prakken             Universiteit Utrecht, The Netherlands 
Luc de Raedt              Universitat Freiburg, Germany 
Ken Satoh                 National Institute of Informatics, Japan 
Renate Schmidt            University of Manchester, UK 
Terrance Swift            SUNY at Stony Brook, USA 
Francesca Toni            Imperial College London, UK 
Paolo Torroni             Universita di Bologna, Italy 
Mirek Truszczynski        University of Kentucky, USA 
Hudson Turner             University of Minnesota, USA 
Wiebe van der Hoek        University of Liverpool, UK 
Toby Walsh                University College Cork, Ireland 
Mary-Anne Williams        University of Technology, Sydney, Australia 
Michael Zakharyaschev     King’s College London, UK 

Additional Referees 

Aditya Ghose 
Agustin Valverde Ramos 
Alessandra Di Pierro 
Alessandro Provetti 
Alfredo Burrieza Muniz 
Alvaro Freitas Moreira 
A. El Fallah-Segrouchni 
Anatoli Deghtyarev 
Andrea Cali 
Andrea Schalk 
Andreas Herzig 
Anni-Yasmin Turhan 
Antonino Rotolo 
Antonio C. Rocha Costa 
Antonis C. Kakas 
Bihn Tran 
Bob Pokorny 
Boris Konev 
Carlos Ivan Chesnevar 
Carlos Uzcategui 
Carsten Fritz 
Carsten Lutz 
Chiaki Sakama 
Claire Lefevre 



Clare Dixon 
Davide Grossi 
Dmitry Tishkovsky 
Emilia Oikarinen 
Evelina Lamma 
Francesco M. Donini 
Francesco Scarcello 
Franck Dumoncel 
Françoise Clerin 
Frank Wolter 
Gabriel Aguilera Venegas 
Gerald Pfeifer 
Gerard Becher 
Gerard Vreeswijk 
Giacomo Terreni 
Gianni Amati 
Giovambattista Ianni 
Giuseppe De Giacomo 
Greg Wheeler 
Hans Tompits 
Hans-Jürgen Ohlbach 
Heinrich Wansing 
Herbert Wiklicky 
Hisashi Hayashi 



Huib Aldewereld 
Ian Pratt-Hartmann 
James McKinna 
Jan Broersen 
Jerome Lang 
Jerszy Karczmarczuk 
Jesus Medina Moreno 
J. F. Lima Alcantara 
Joris Hulstijn 
Jürg Kohlas 
Jurriaan van Diggelen 
Ken Kaneiwa 
Kostas Stathis 
Kristian Kersting 
Laure Vieu 
Leila Amgoud 
Lengning Liu 
Leon van der Torre 
Luigi Palopoli 
Luis da Cunha Lamb 
Lutz Strassburger 
M. Birna van Riemsdijk 
Maartijn van Otterlo 
Manfred Jaeger 




Marc Denecker 
Marcello Balduccini 
M. Gago-Fernandez 
Marta Cialdea 
Martin Giese 
Matteo Baldoni 
Maurice Pagnucco 
Michael Winikoff 
Nelson Rushton 
Norbert E. Fuchs 
Ofer Arieli 
Olivier Gasquet 
Pablo Cordero Ortega 
Paola Bruscoli 
Paolo Ferraris 
Paolo Liberatore 
Pasquale Rullo 
Peep Küngas 
Peter Baumgartner 
Phan Minh Dung 
Philippe Balbiani 
Philippe Besnard 
Pierangelo Dell’Acqua 
Piero Bonatti 
Ralf Küsters 
Renata Vieira 
Robert Kowalski 
Roman Kontchakov 
Roman Schindlauer 
Sergei Odintsov 
Souhila Kaci 
Stefan Woltran 
Stephen Read 
Tiberiu Stratulat 
Tom Kelsey 
Tommi Syrjanen 
Ulle Endriss 
Ullrich Hustadt 
Uwe Waldmann 
Victor Marek 
Vincent Louis 
Vincenzo Pallotta 
Viviana Patti 
Wolfgang Faber 
Youssef Chahir 
Yves Moinard 

Secretariat 

Filipa Mira Reis 
Silvia Marina Costa 

Organizing Committee 

Antonio Albuquerque 
Duarte Alvim 
Eduardo Barros 
Jamshid Ashtari 
Joana Lopes 
Miguel Morais 
Sergio Lopes 

Sponsoring Institutions 

FBA 
FCT, Fundação para a Ciência e a Tecnologia 
Opensoft 
CoLogNET 
Fundação Calouste Gulbenkian 





Table of Contents 



Invited Talks 

Representing and Reasoning with Preferences 1 

F. Rossi 

Engineering of Logics for the Content-Based Representation 

of Information 2 

F. Baader 

Formal Methods in Robotics 4 

B. Nebel 

Multi-agent Systems 

Games for Cognitive Agents 5 

M. Dastani, L. van der Torre 

Knowledge-Theoretic Properties of Strategic Voting 18 

S. Chopra, E. Pacuit, R. Parikh 

The CIFF Proof Procedure for Abductive Logic Programming 

with Constraints 31 

U. Endriss, P. Mancarella, F. Sadri, G. Terreni, F. Toni 

Hierarchical Decision Making by Autonomous Agents 44 

S. Heymans, D. Van Nieuwenborgh, D. Vermeir 

Verifying Communicating Agents by Model Checking 

in a Temporal Action Logic 57 

L. Giordano, A. Martelli, C. Schwind 

Qualitative Action Theory 

(A Comparison of the Semantics of Alternating-Time Temporal Logic 

and the Kutschera-Belnap Approach to Agency) 70 

S. Wölfl 

Practical Reasoning for Uncertain Agents 82 

N. de C. Ferreira, M. Fisher, W. van der Hoek 

Modelling Communicating Agents in Timed Reasoning Logics 95 

N. Alechina, B. Logan, M. Whitsey 







Logic Programming and Nonmonotonic Reasoning 

On the Relation Between ID-Logic and Answer Set Programming 108 

M. Marien, D. Gilis, M. Denecker 

An Implementation of Statistical Default Logic 121 

G.R. Wheeler, C.V. Damasio 

Capturing Parallel Circumscription with Disjunctive Logic Programs 134 

T. Janhunen, E. Oikarinen 

Towards a First Order Equilibrium 

Logic for Nonmonotonic Reasoning 147 

D. Pearce, A. Valverde 

Characterizations for Relativized Notions 

of Equivalence in Answer Set Programming 161 

S. Woltran 

Equivalence of Logic Programs Under Updates 174 

K. Inoue, C. Sakama 

Cardinality Constraint Programs 187 

T. Syrjanen 

Recursive Aggregates in Disjunctive Logic Programs: 

Semantics and Complexity 200 

W. Faber, N. Leone, G. Pfeifer 

Reasoning Under Uncertainty 

A Logic for Reasoning About Coherent Conditional Probability: 

A Modal Fuzzy Logic Approach 213 

E. Marchioni, L. Godo 

A Logic with Conditional Probabilities 226 

M. Raskovic, Z. Ognjanovic, Z. Markovic 

Reasoning About Quantum Systems 239 

P. Mateus, A. Sernadas 

Sorted Multi-adjoint Logic Programs: 

Termination Results and Applications 252 

C.V. Damasio, J. Medina, M. Ojeda-Aciego 

Logic Programming 

The Modal Logic Programming System MProlog 266 

L. A. Nguyen 







Soundness and Completeness of an “Efficient” Negation for Prolog 279 

J.J. Moreno-Navarro, S. Muñoz-Hernández 

Logic Programs with Functions and Default Values 294 

P. Cabalar, D. Lorenzo 

Actions and Causation 

Parallel Encodings of Classical Planning as Satisfiability 307 

J. Rintanen, K. Heljanko, I. Niemelä 

Relational Markov Games 320 

A. Finzi, T. Lukasiewicz 

On the Logic of ‘Being Motivated to Achieve ρ, Before δ’ 334 

J. Broersen 

Complexity Issues 

Representation and Complexity in Boolean Games 347 

P.E. Dunne, W. van der Hoek 

Complexity in Value-Based Argument Systems 360 

P.E. Dunne, T. Bench-Capon 

A Polynomial Translation from the Two-Variable Guarded Fragment 

with Number Restrictions to the Guarded Fragment 372 

Y. Kazakov 

Description Logics 

Transforming Fuzzy Description Logics 

into Classical Description Logics 385 

U. Straccia 

Computing the Least Common Subsumer 

w.r.t. a Background Terminology 400 

F. Baader, B. Sertkaya, A.-Y. Turhan 

Explaining Subsumption by Optimal Interpolation 413 

S. Schlobach 

Belief Revision 

Two Approaches to Merging Knowledge Bases 426 

J.P. Delgrande, T. Schaub 

An Algebraic Approach to Belief Contraction 

and Nonmonotonic Entailment 439 

L. Flax 







Logical Connectives for Nonmonotonicity: 

A Choice Function-Based Approach 452 

J. Mengin 

On Sceptical Versus Credulous Acceptance 

for Abstract Argument Systems 462 

S. Doutre, J. Mengin 

Modal, Spatial, and Temporal Logics 

Line-Based Affine Reasoning in Euclidean Plane 474 

P. Balbiani, T. Tinchev 

Measure Logics for Spatial Reasoning 487 

M. Giritli 

Only Knowing with Confidence Levels: Reductions and Complexity 500 

E.H. Lian, T. Langholm, A. Waaler 

Time Granularities and Ultimately Periodic Automata 513 

D. Bresolin, A. Montanari, G. Puppis 

Theorem Proving 

Polynomial Approximations of Full Propositional Logic 

via Limited Bivalence 526 

M. Finger 

Some Techniques for Branch-Saturation in Free-Variable Tableaux 539 

N. Peltier 

Semantic Knowledge Partitioning 552 

C. Wernhard 

Negative Hyper-resolution as Procedural Semantics 

of Disjunctive Logic Programs 565 

L.A. Nguyen 

Applications 

Discovering Anomalies in Evidential Knowledge 

by Logic Programming 578 

F. Angiulli, G. Greco, L. Palopoli 

Logic Programming Infrastructure for Inferences on FrameNet 591 

P. Baumgartner, A. Burchardt 

An Answer Set Programming Encoding 

of Prioritized Removed Sets Revision: Application to GIS 604 

J. Ben-Naim, S. Benferhat, O. Papini, E. Würbel 







Automatic Compilation of Protocol Insecurity Problems 

into Logic Programming 617 

A. Armando, L. Compagna, Y. Lierler 

Exploiting Functional Dependencies 

in Declarative Problem Specifications 628 

M. Cadoli, T. Mancini 

Combining Decision Procedures for Sorted Theories 641 

C. Tinelli, C.G. Zarba 

Meta-level Verification of the Quality of Medical Guidelines 

Using Interactive Theorem Proving 654 

A. Hommersom, P. Lucas, M. Balser 

Towards a Logical Analysis of Biochemical Pathways 667 

P. Doherty, S. Kertes, M. Magnusson, A. Szalas 

Systems Session 

Abductive Logic Programming with CIFF: System Description 680 

U. Endriss, P. Mancarella, F. Sadri, G. Terreni, F. Toni 

The DALI Logic Programming Agent-Oriented Language 685 

S. Costantini, A. Tocchio 

Qsmodels: ASP Planning in Interactive Gaming Environment 689 

L. Padovani, A. Provetti 

A System with Template Answer Set Programs 693 

F. Calimeri, G. Ianni, G. Ielpa, A. Pietramala, M.C. Santoro 

New DLV Features for Data Integration 698 

F. Calimeri, M. Citrigno, C. Cumbo, W. Faber, N. Leone, S. Perri, 

G. Pfeifer 

Profiling Answer Set Programming: 

The Visualization Component of the noMoRe System 702 

A. Bosel, T. Linke, T. Schaub 

The PLP System 706 

T. Wakaki, K. Inoue, C. Sakama, K. Nitta 

The MyYapDB Deductive Database System 710 

M. Ferreira, R. Rocha 

InterProlog: Towards a Declarative Embedding of Logic Programming 

in Java 714 

M. Calejo 







IndLog — Induction in Logic 718 

R. Camacho 

OLEX - A Reasoning-Based Text Classifier 722 

C. Cumbo, S. Iiritano, P. Rullo 

Verdi: An Automated Tool for Web Sites Verification 726 

M. Alpuente, D. Ballis, M. Falaschi 

SATMC: A SAT-Based Model Checker for Security Protocols 730 

A. Armando, L. Compagna 

tabeql: A Tableau Based Suite for Equilibrium Logic 734 

A. Valverde 

tascpl: TAS Solver for Classical Propositional Logic 738 

M. Ojeda-Aciego, A. Valverde 

Author Index 743 




Representing and Reasoning with Preferences* 



Francesca Rossi 

University of Padova, Italy 
frossi@math.unipd.it 



Many problems in AI require us to represent and reason about preferences. You 
may, for example, prefer to schedule all your meetings after 10am. Or you may 
prefer to buy a faster computer than one with a larger disk. 

In this talk, I will describe various formalisms proposed for representing 
preferences. More precisely, I will talk about soft constraints [1] and CP nets [2], 
which are, respectively, quantitative and qualitative formalisms to handle 
preferences. 

I will then discuss how we can reason about preferences, possibly in the 
presence of both hard and soft constraints [3]. In this line of work, I will show 
how CP nets can be paired with a set of hard or soft statements and how the best 
solutions according to a given modelling of the preferences can be obtained. 

I will also consider preference aggregation in the context of multi-agent 
systems. I will propose several semantics for preference aggregation based on voting 
theory, and I will consider the notion of fairness in this context [4]. Fairness is 
a property which is not possible to obtain (due to Arrow’s impossibility 
theorem) if preferences are described via total orders. In our more general context of 
possibly partially ordered preferences, a similar result holds for a class of partial 
orders. 



References 

1. S. Bistarelli, U. Montanari, and F. Rossi. Semiring-based Constraint Solving and 
Optimization. Journal of the ACM, 44(2):201-236, March 1997. 

2. C. Domshlak and R. Brafman. CP-nets - Reasoning and Consistency Testing. Proc. 
KR-02, 2002, pp. 121-132. 

3. C. Domshlak, F. Rossi, K.B. Venable, and T. Walsh. Reasoning about soft 
constraints and conditional preferences: complexity results and approximation 
techniques. Proc. IJCAI-03, 2003. 

4. F. Rossi, K.B. Venable, and T. Walsh. mCP nets: representing and reasoning with 
preferences of multiple agents. Proc. AAAI 2004, San Jose, CA, USA, July 2004. 



* Joint work with Toby Walsh, Steve Prestwich, and Brent Venable. 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, p. 1, 2004. 




Engineering of Logics for the 
Content-Based Representation of Information 



Franz Baader 



Theoretical Computer Science 
TU Dresden 
Germany 

baader@tcs.inf.tu-dresden.de 



Abstract. The content-based representation of information, which tries 
to represent the meaning of the information in a machine-understandable 
way, requires representation formalisms with a well-defined formal 
semantics. This semantics can elegantly be provided by the use of a 
logic-based formalism. However, in this setting there is a fundamental tradeoff 
between the expressivity of the representation formalism and the 
efficiency of reasoning with this formalism. This motivates the “engineering 
of logics”, i.e., the design of logical formalisms that are tailored to specific 
representation tasks. The talk will illustrate this approach with the 
example of so-called Description Logics and their application for databases 
and as ontology languages for the semantic web. 



Storage and transfer of information as well as interfaces for accessing this 
information have undergone a remarkable evolution. Nevertheless, information 
systems are still not “intelligent” in the sense that they “understand” the 
information they store, manipulate, and present to their users. A case in point is the 
World Wide Web and the search engines that provide access to the vast amount of 
information available there. Web pages are mostly written for human consumption 
and the mark-up provides only rendering information for textual and graphical 
information. Search engines are usually based on keyword search and often 
provide a huge number of answers, many of which are completely irrelevant, whereas 
some of the more interesting answers are not found. In contrast, the vision of a 
“Semantic Web” [4] aims for machine-understandable web resources, whose 
content can then be comprehended and processed both by automated tools, such as 
search engines, and by human users. 

The content-based representation of information requires representation 
formalisms with a well-defined formal semantics since otherwise there cannot be 
a common understanding of the represented information. This semantics can 
elegantly be provided by a translation into an appropriate logic or the use of 
a logic-based formalism in the first place. This logical approach has the 
additional advantage that logical inferences can then be used to reason about the 
represented information, thus detecting inconsistencies and computing implicit 
information. However, in this setting there is a fundamental tradeoff between the 
expressivity of the representation formalism on the one hand, and the efficiency 
of reasoning with this formalism on the other hand [8]. 

This motivates the “engineering of logics”, i.e., the design of logical 
formalisms that are tailored to specific representation tasks. This also encompasses 
the formal investigation of the relevant inference problems, the development of 
appropriate inference procedures, and their implementation, optimization, and 
empirical evaluation. 

The talk will illustrate this approach with the example of so-called 
Description Logics [1] and their application for conceptual modeling of databases [6,5] 
and as ontology languages for the Semantic Web [2,3,7]. 

J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 2-3, 2004. 

References 

1. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. 
Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, 
and Applications. Cambridge University Press, 2003. 

2. Franz Baader, Ian Horrocks, and Ulrike Sattler. Description logics for the semantic 
web. KI Künstliche Intelligenz, 4, 2002. 

3. Franz Baader, Ian Horrocks, and Ulrike Sattler. Description logics. In Steffen 
Staab and Rudi Studer, editors, Handbook on Ontologies, International Handbooks 
in Information Systems, pages 3-28. Springer-Verlag, Berlin, Germany, 2003. 

4. T. Berners-Lee, J. Hendler, and O. Lassila. The semantic Web. Scientific American, 
284(5):34-43, 2001. 

5. Alex Borgida, Maurizio Lenzerini, and Riccardo Rosati. Description logics for 
databases. In [1], pages 462-484. 2003. 

6. Enrico Franconi and Gary Ng. The i.com tool for intelligent conceptual modeling. 
In Proc. of the 7th Int. Workshop on Knowledge Representation meets Databases 
(KRDB 2000), pages 45-53, 2000. 

7. Ian Horrocks, Peter F. Patel-Schneider, and Frank van Harmelen. From SHIQ and 
RDF to OWL: The making of a web ontology language. Journal of Web Semantics, 
1(1):7-26, 2003. 

8. Hector J. Levesque and Ron J. Brachman. A fundamental tradeoff in knowledge 
representation and reasoning. In Ron J. Brachman and Hector J. Levesque, editors, 
Readings in Knowledge Representation, pages 41-70. Morgan Kaufmann, Los Altos, 
1985. 




Formal Methods in Robotics 



Bernhard Nebel 

Albert-Ludwigs-Universitat Freiburg 

AI research in robotics started out with the hypothesis that logical modelling 
and reasoning play a key role. This assumption was seriously questioned by 
behaviour-based and “Nouvelle AI” approaches. The credo of this school of 
thinking is that explicit modelling of the environment and reasoning about it 
is too brittle and computationally too expensive. Instead a purely reactive 
approach is favoured. 

With the increase of computing power we have seen over the last two decades, 
the argument about the computational costs is not really convincing any more. 
Furthermore, the brittleness argument also ceases to be convincing once we 
start to incorporate probabilities and utilities. I will argue that it is indeed 
feasible to use computation-intensive approaches based on explicit models of the 
environment to control a robot - and achieve competitive performance. 

Most of the time one has to go beyond purely logical approaches, though, 
because it is necessary to be better than an opponent. For this reason, decision 
theory and game theory become important ingredients. However, purely logical 
approaches can have their place if we want to guarantee worst-case properties. I 
will demonstrate these claims using examples from our robotic soccer team, our 
foosball robot and our simulated rescue agent team. 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, p. 4, 2004. 




Games for Cognitive Agents 



Mehdi Dastani 1 and Leendert van der Torre 2 

1 Utrecht University, mehdi@cs.uu.nl 
2 CWI, torre@cwi.nl 



Abstract. Strategic games model the interaction among simultaneous decisions 
of agents. The starting point of strategic games is a set of players (agents) having 
strategies (decisions) and preferences on the game’s outcomes. In this paper we 
do not assume the decisions and preferences of agents to be given in advance, but 
we derive them from the agents' mental attitudes. We specify such agents and define 
a mapping from their specification to the specification of the strategic game they 
play. We discuss a reverse mapping from the specification of strategic games that 
agents play to a specification of those agents. This mapping can be used to specify 
a group of agents that can play a strategic game, which shows that the notion of 
agent system specification is expressive enough to play any kind of game. 



1 Introduction 

There are several approaches in artificial intelligence, cognitive science, and practical 
reasoning (within philosophy) to the decision making of individual agents. Most of these 
theories have been developed independently of classical decision theory based on the 
expected utility paradigm (usually identified with the work of von Neumann and Morgenstern 
[11] and Savage [9]) and classical game theory. In these approaches, the decision making 
of individual autonomous agents is described in terms of concepts other than maximizing 
utility. For example, since the early 40s there has been a distinction between classical decision 
theory and artificial intelligence based on utility aspiration levels and goal based planning 
(as pioneered by Simon [10]). Qualitative decision theories have been developed based 
on beliefs (probabilities) and desires (utilities) using formal tools such as modal logic 
[1]. Also, these beliefs-desires models have been extended with intentions or BDI models 
[3,8]. Moreover, in cognitive science and philosophy the decision making of individual 
agents is described in terms of concepts from folk psychology like beliefs, desires and 
intentions. In these studies, the decision making of individual agents is characterized in 
terms of a rational balance between these concepts, and the decision making of a group 
of agents is described in terms of concepts generalized from those used for individual 
agents, such as joint goals, joint intentions, joint commitments, etc. Moreover, new 
concepts are introduced at this social level, such as norms (a central concept in most 
social theories). We are interested in the relation between AI theories of decision making 
and their classical counterparts. 

J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 5-17, 2004. 

We introduce a rule based qualitative decision theory for agents with beliefs and 
desires. Like classical decision theory but in contrast to several proposals in the BDI 
approach [3,8], the theory does not incorporate decision processes, temporal reasoning, 
or scheduling. We also ignore probabilistic decisions. In particular, we explain how 
decisions and preferences of individual agents can be derived from their beliefs and 
desires. We specify groups of agents and discuss the interaction between their decisions 
and preferences. The problems we address are: 1) How can we map the specification of 
the agent system to the specification of the strategic game that they play? This mapping 
considers agent decisions as agent strategies and decision profiles (a decision for each 
agent) as the outcomes of the strategic game. 2) How can we map the specification of a 
strategic game to the specification of the agent system that plays the game? This mapping 
provides the mental attitudes of agents that can play a strategic game. We show that the 
composition of the mapping from the specification of a strategic game to the specification 
of an agent system with the mapping back is the identity relation, while the composition 
of the mapping from the specification of an agent system to the specification of a 
strategic game with the mapping back is not necessarily the identity relation. 
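
As an illustration of the first mapping, the following is a minimal sketch, not the construction defined in the paper: it only shows how decision profiles (one decision per agent) can be enumerated as the outcomes of a strategic game once each agent's decisions are treated as its strategies. The agent names and decisions are hypothetical example data, and the function name `decision_profiles` is an assumption introduced here.

```python
from itertools import product

# Hypothetical example data: each agent's available decisions (its strategies).
decisions = {
    "agent1": ["upgrade", "wait"],
    "agent2": ["allow", "disallow"],
}


def decision_profiles(decisions):
    """Return every decision profile, i.e. one decision for each agent.

    Agents are processed in sorted order so the result is deterministic.
    """
    agents = sorted(decisions)
    choices = [decisions[agent] for agent in agents]
    return [dict(zip(agents, combo)) for combo in product(*choices)]


# The profiles play the role of the outcomes of the strategic game.
profiles = decision_profiles(decisions)
```

Preferences over these outcomes would still have to be derived from each agent's beliefs and desires, which this sketch leaves out.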

The layout of this paper is as follows. In section 2 we introduce the rule based 
qualitative decision theory. In section 3 we define a mapping from the specification of 
the agent system to the specification of the strategic game they play. In section 4, we 
discuss the reverse mapping from the specification of a strategic game to the specification 
of the agent system that plays the game. 

2 Agents, Decisions, and Preferences 

The specification of agent systems introduced in this section is developed for agents that 
have conditional beliefs and desires. The architecture and the behavior of this type of 
agent are studied in [2]. Here, we analyze this type of agent from a decision and game 
theoretic point of view by studying possible decisions of individual agents and the 
interaction between these decisions. We do so by defining an agent system specification 
that indicates possible decisions of individual agents and possible decision profiles (i.e., 
multiagent decisions). We show how we can derive agent decision profiles and 
preferences from an agent system specification. In this section, we first define the specification 
of multiagent systems. Then, we study possible and feasible individual and multiagent 
decisions within an agent system specification. Finally, we discuss agents’ preference 
ordering defined on the set of individual and multiagent decisions. 

2.1 Agent System Specification 

The starting point of any theory of decision is a distinction between choices made by the 
decision maker and choices imposed on it by its environment. For example, a software 
upgrade agent (decision maker) may have the choice to upgrade a computer system at 
a particular time of the day. The software company (the environment) may in turn 
allow/disallow such an upgrade at a particular time. Therefore, we assume $n$ disjoint sets 
of propositional atoms $A = A_1 \cup \ldots \cup A_n$ with typical elements $a, b, c, \ldots$ (agents’ 
decision variables [6] or controllable propositions [1]) and a set of propositional atoms $W$ 
with typical elements $p, q, r, \ldots$ (the world parameters or uncontrollable propositions) 
such that $A \cap W = \emptyset$. In the sequel, the propositional languages that are built up from 
$A_i$, $A$, $W$, and $A \cup W$ atoms are denoted by $L_{A_i}$, $L_A$, $L_W$, and $L_{AW}$, respectively. 
Finally, we use variables $x, y, \ldots$ to stand for any sentences of the languages $L_{A_i}$, $L_A$, $L_W$, 
and $L_{AW}$. 







An agent system specification given in Definition 1 contains a set of agents and 
for each agent a description of its decision problem. The agent’s decision problem is 
defined in terms of its beliefs and desires, which are formalized as belief and desire 
rules, a preference ordering on the powerset of the set of desire rules, a set of facts, 
and an initial decision (or prior intentions). An initial decision reflects that the agent 
has already made a decision (intention) in an earlier stage. One may argue that it is not 
realistic to define the preference ordering on the power set of the set of desire rules 
since this implies that for each agent its preference on all combinations of its individual 
desires should be specified beforehand. As explained elsewhere [4,5], it is possible
to define the preference ordering on the set of desire rules and then lift this ordering to 
the powerset of the set of desire rules. We have chosen to define the preference ordering 
on the powerset of the set of desire rules to avoid additional complexity which is not 
related to the main focus of this paper. The preference ordering is also assumed to be
a total preorder (i.e., reflexive, transitive, and complete). Again, although this assumption is
quite strong for realistic applications, we adopt it because this paper does not aim at
a theory for realistic applications. Finally, we assume that agents are autonomous,
in the sense that there are no priorities between desires of distinct agents. 

Definition 1 (Agent system specification). An agent system specification is a tuple
$AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ that contains a set of agents $S = \{\alpha_1, \ldots, \alpha_n\}$, and for
each agent $\alpha_i$ a finite set of facts $F_i \subseteq L_W$ ($F = \langle F_1, \ldots, F_n \rangle$), a finite set of belief
rules $B_i \subseteq L_{AW} \times L_W$ ($B = \langle B_1, \ldots, B_n \rangle$), a finite set of desire rules $D_i \subseteq L_{AW} \times L_{AW}$
($D = \langle D_1, \ldots, D_n \rangle$), a relation $\geq_i$ on the powerset of $D_i$, i.e. $\geq_i \subseteq Pow(D_i) \times Pow(D_i)$
($\geq = \langle \geq_1, \ldots, \geq_n \rangle$) which is reflexive, transitive, and complete, and a finite
initial decision $\lambda_i^0 \subseteq L_{A_i}$ ($\lambda^0 = \langle \lambda_1^0, \ldots, \lambda_n^0 \rangle$).

In general, a belief rule is an ordered pair x => y with $x \in L_{AW}$ and $y \in L_W$. This
belief rule should be interpreted as ‘the agent believes y in context x’. A desire rule is an
ordered pair x => y with $x \in L_{AW}$ and $y \in L_{AW}$. This desire rule should be interpreted
as ‘the agent desires y in context x’. It implies that the agent’s beliefs are about the world 
(x => p), and not about the agent’s decisions. These beliefs can be about the effects of 
decisions made by the agent (a => p ) as well as beliefs about the effects of parameters 
set by the world (p => q). Moreover, the agent’s desires can be about the world (x => p, 
desire-to-be), but also about the agent’s decisions (x => a, desire-to-do). These desires 
can be triggered by parameters set by the world (p => y) as well as by decisions made 
by the agent ( a => y). Modelling mental attitudes such as beliefs and desires in terms of 
rules can be called modelling conditional mental attitudes [2]. 

2.2 Agent Decisions 

In the sequel we consider each agent from an agent system specification as a decision 
making agent. A decision $\lambda$ of the agent $\alpha_i$ is any consistent subset of $L_{A_i}$ that contains
the initial decision $\lambda_i^0$.

Definition 2 (Decisions). Let $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ be an agent system specification, $L_{A_i}$ be the propositional language built up from $A_i$, and $\models_{A_i}$ be satisfiability
in propositional logic $L_{A_i}$. An AS decision $\lambda$ is a decision of the agent $\alpha_i$ such that
$\lambda_i^0 \subseteq \lambda \subseteq L_{A_i}$ & $\lambda \not\models_{A_i} \bot$. The set of possible decisions of agent $\alpha_i$ is denoted by $\Lambda_i$.







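The set of possible decisions $\Lambda_i$ of an agent $\alpha_i$ contains logically equivalent decisions.
Two decisions $\lambda, \lambda' \in \Lambda_i$ are logically equivalent, denoted as $\lambda \equiv \lambda'$, if and only if for
all models $M$ of $L_{A_i}$: $M \models \lambda$ iff $M \models \lambda'$, where $M \models \lambda$ iff $\forall x \in \lambda$: $M \models x$.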

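Definition 3 (Non-equivalent Decisions). Let $\Lambda_i$ be the set of possible decisions of
agent $\alpha_i$. A set of possible logically non-equivalent decisions of agent $\alpha_i$, denoted as
$\bar{\Lambda}_i$, is a subset of $\Lambda_i$ such that: $\forall \lambda \in \Lambda_i\ \exists \lambda' \in \bar{\Lambda}_i$: $\lambda \equiv \lambda'$, and $\forall \lambda, \lambda' \in \bar{\Lambda}_i$ with $\lambda \neq \lambda'$: $\lambda \not\equiv \lambda'$.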

The decisions of an agent depend on the believed consequences of those decisions. The 
consequences are generated by applying its belief rules to its input facts together with 
those decisions. In our framework the decisions are formalized based on the notion of 
extension. 

Definition 4 (Belief Extension). Let $Cn_A$, $Cn_W$ and $Cn_{AW}$ be the consequence sets
for theories from $L_A$, $L_W$, and $L_{AW}$, respectively, and $\models_A$, $\models_W$ and $\models_{AW}$ be satisfiability, in propositional logics $L_A$, $L_W$, and $L_{AW}$, respectively. Let $B_i$ be a set of belief
rules of agent $\alpha_i$, $\lambda_i \in \Lambda_i$ be one of its possible decisions, and $F_i \subseteq L_W$ be its set of
facts. The belief consequences of $F_i \cup \lambda_i$ of agent $\alpha_i$ are:
$B_i(F_i \cup \lambda_i) = \{ y \mid x \Rightarrow y \in B_i,\ x \in F_i \cup \lambda_i \}$
and the belief extension of $F_i \cup \lambda_i$ is the set of the consequents of the
iteratively $B_i$-applicable rules:
$E_{B_i}(F_i \cup \lambda_i) = \bigcap_{F_i \cup \lambda_i \subseteq X,\ B_i(Cn_{AW}(X)) \subseteq X} X$.

We give some properties of the belief extension of facts and possible decisions in Definition 4. First note that $E_{B_i}(F_i \cup \lambda_i)$ is not closed under logical consequence. The
following proposition shows that $E_{B_i}(F_i \cup \lambda_i)$ is the smallest superset of $F_i \cup \lambda_i$ closed
under the belief rules $B_i$ interpreted as inference rules.

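Proposition 1. Let $E^0_{B_i}(F_i \cup \lambda_i) = F_i \cup \lambda_i$ and $E^j_{B_i}(F_i \cup \lambda_i) = E^{j-1}_{B_i}(F_i \cup \lambda_i) \cup B_i(Cn_{AW}(E^{j-1}_{B_i}(F_i \cup \lambda_i)))$ for $j > 0$. We have $E_{B_i}(F_i \cup \lambda_i) = \bigcup_{j=0}^{\infty} E^j_{B_i}(F_i \cup \lambda_i)$.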

The following example illustrates that extensions can be inconsistent. 

Example 1. Let $B_i = \{\top \Rightarrow p,\ a \Rightarrow \neg p\}$, $F_i = \emptyset$, and $\lambda_i = \{a\}$, where $\top$ stands for
any tautology like $p \vee \neg p$. We have $E_{B_i}(\emptyset) = \{p\}$ and $E_{B_i}(F_i \cup \lambda_i) = \{a, p, \neg p\}$,
which means that the belief extension of $F_i \cup \lambda_i$ is inconsistent.
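
The iterative construction of Proposition 1 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the paper: sentences are restricted to literals (the string "~p" encodes the negation of p), a rule antecedent is a set of literals read as a conjunction, an empty antecedent plays the role of a tautology, and rule applicability is tested by set containment rather than by full propositional consequence. The names `rule`, `belief_extension` and `is_consistent` are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A rule is (antecedent literals, consequent literal); "~p" encodes "not p".
Rule = Tuple[FrozenSet[str], str]


def rule(antecedent: Iterable[str], consequent: str) -> Rule:
    """Build a rule; an empty antecedent stands for a tautology."""
    return (frozenset(antecedent), consequent)


def belief_extension(rules: Iterable[Rule], start: Iterable[str]) -> Set[str]:
    """Close `start` under the rules, adding consequents until a fixpoint."""
    rules = list(rules)
    extension = set(start)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent <= extension and consequent not in extension:
                extension.add(consequent)
                changed = True
    return extension


def is_consistent(literals: Iterable[str]) -> bool:
    """A literal set is consistent if it never holds both p and ~p."""
    literals = set(literals)
    return not any(("~" + lit) in literals for lit in literals)


# Example 1: B_i = {T => p, a => ~p}, F_i = {} and lambda_i = {a}.
B_i = [rule([], "p"), rule(["a"], "~p")]
print(belief_extension(B_i, set()))
print(belief_extension(B_i, {"a"}), is_consistent(belief_extension(B_i, {"a"})))
```

Under these assumptions the sketch returns the two extensions stated in Example 1 and reports the second one as inconsistent.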

Although decisions with inconsistent belief consequences are not feasible decisions, we 
consider them, besides decisions with consistent consequences, as possible decisions. 
Feasible decisions are defined by excluding decisions that have inconsistent belief con- 
sequences. 

Definition 5 (Feasible Decisions). Let $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ be an agent system
specification, $\Lambda_i$ and $\bar{\Lambda}_i$ be the set of possible decisions and a set of possible logically
non-equivalent decisions for agent $\alpha_i \in S$, respectively. The set of feasible decisions of
agent $\alpha_i$, denoted by $\Lambda_i^f$, is the subset of its possible decisions $\Lambda_i$ that have consistent
belief consequences, i.e., $\Lambda_i^f = \{\lambda_i \mid \lambda_i \in \Lambda_i$ & $E_{B_i}(F_i \cup \lambda_i)$ is consistent$\}$. A set of
logically non-equivalent feasible decisions of agent $\alpha_i$, denoted by $\bar{\Lambda}_i^f$, is the subset of
a set of possible non-equivalent decisions $\bar{\Lambda}_i$ that have consistent belief consequences,
i.e. $\bar{\Lambda}_i^f = \{\lambda_i \mid \lambda_i \in \bar{\Lambda}_i$ & $E_{B_i}(F_i \cup \lambda_i)$ is consistent$\}$.

The following example illustrates the decisions of a single agent. 







Example 2. Let $A_1 = \{a, b, c, d\}$, $W = \{p, q\}$ and $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ with
$S = \{\alpha_1\}$, $F_1 = \emptyset$, $B_1 = \{b \Rightarrow q,\ c \Rightarrow p,\ d \Rightarrow \neg p\}$, $D_1 = \{b \Rightarrow p,\ d \Rightarrow \neg q\}$,
$\geq_1 = \emptyset < \{b \Rightarrow p\} < \{d \Rightarrow \neg q\} < \{b \Rightarrow p,\ d \Rightarrow \neg q\}$, and $\lambda_1^0 = \{a\}$. Note that
the consequents of all $B_1$ rules are sentences of $L_W$. We have due to the definition of
$E_{B_1}(F_1 \cup \lambda_1)$, for example, the following logically non-equivalent decisions.

$E_{B_1}(F_1 \cup \{a\}) = \{a\}$,  $E_{B_1}(F_1 \cup \{a, b\}) = \{a, b, q\}$,

$E_{B_1}(F_1 \cup \{a, c\}) = \{a, c, p\}$,  $E_{B_1}(F_1 \cup \{a, d\}) = \{a, d, \neg p\}$,

$E_{B_1}(F_1 \cup \{a, b, c\}) = \{a, b, c, p, q\}$,  $E_{B_1}(F_1 \cup \{a, b, d\}) = \{a, b, d, \neg p, q\}$,

$E_{B_1}(F_1 \cup \{a, c, d\}) = \{a, c, d, p, \neg p\}$,  $E_{B_1}(F_1 \cup \{a, b, c, d\}) = \{a, b, c, d, p, \neg p, q\}$, ...

Therefore {a, c, d} and {a, b, c, d} are infeasible AS decisions, because their belief
extensions are inconsistent. Continued in Example 4.
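
As a rough check of Example 2, the following sketch enumerates the decisions that extend the initial decision {a} with some of the atoms b, c, d and tests the consistency of their belief extensions. It assumes a literal-only encoding ("~p" for the negation of p) and considers only decisions that are sets of atoms, as listed in the example; the helper names are hypothetical and not taken from the paper.

```python
from itertools import combinations


def extension(rules, start):
    """Close `start` under rules given as (antecedent literals, consequent)."""
    closed = set(start)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if set(antecedent) <= closed and consequent not in closed:
                closed.add(consequent)
                changed = True
    return closed


def is_consistent(literals):
    """Consistent means no atom occurs together with its negation "~atom"."""
    return not any("~" + lit in literals for lit in literals)


# Belief rules of Example 2: b => q, c => p, d => ~p.
B_1 = [(("b",), "q"), (("c",), "p"), (("d",), "~p")]
INITIAL = {"a"}            # the initial decision lambda_1^0
OPTIONAL = ["b", "c", "d"]  # remaining decision variables


def atom_decisions():
    """Yield every decision that extends the initial decision with atoms."""
    for size in range(len(OPTIONAL) + 1):
        for extra in combinations(OPTIONAL, size):
            yield INITIAL | set(extra)


def infeasible_decisions():
    """Decisions whose belief extension is inconsistent."""
    return [sorted(d) for d in atom_decisions()
            if not is_consistent(extension(B_1, d))]


print(infeasible_decisions())
```

With this encoding, exactly the decisions {a, c, d} and {a, b, c, d} are reported as infeasible, in line with the example.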



2.3 Multiagent Decisions 

In the previous subsection, we have defined the set of decisions, sets of logically non- 
equivalent decisions, the set of feasible decisions, and sets of logically non-equivalent 
feasible decisions for one single agent. In this section, we concentrate on multiagent deci- 
sions, which are also called decision profiles, and distinguish various types of multiagent 
decisions. 

Definition 6. Let $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ be an agent system specification where
$S = \{\alpha_1, \ldots, \alpha_n\}$. Let also $\Lambda_i$ and $\bar{\Lambda}_i$ be the set of possible decisions and a set of
logically non-equivalent decisions for agent $\alpha_i \in S$, respectively. The set of possible
decision profiles and a set of logically non-equivalent AS decision profiles are
$\Lambda = \Lambda_1 \times \ldots \times \Lambda_n$ and $\bar{\Lambda} = \bar{\Lambda}_1 \times \ldots \times \bar{\Lambda}_n$, respectively. An AS decision profile (i.e., a
multiagent decision) $\lambda$ is a tuple $\langle \lambda_1, \ldots, \lambda_n \rangle$, where $\lambda_i \in \Lambda_i$ for $1 \leq i \leq n$.

According to definition 5, the feasibility of decisions of individual agents is formulated 
in terms of the consistency of the extension that is calculated based on the decision and 
its own facts and beliefs. In a multiagent setting the feasibility of decisions of a single 
agent depends also on the decisions of other agents. For example, if an agent decides to 
open a door while another agent decides to close it, then the combined decision can be 
considered as infeasible. In order to capture the feasibility of multiagent decisions, we 
consider the feasibility of decision profiles which depends on whether agents’ beliefs, 
facts, and decisions are private or public. 

Definition 7. Let $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ where $S = \{\alpha_1, \ldots, \alpha_n\}$. A decision profile
$\lambda = \langle \lambda_1, \ldots, \lambda_n \rangle$ is feasible if $E_B(F \cup \lambda)$ is consistent. Below, eight ways to
calculate $E_B(F \cup \lambda)$ are distinguished.

1. $E_{B_1 \cup \ldots \cup B_n}(F_1 \cup \lambda_1 \cup \ldots \cup F_n \cup \lambda_n)$, i.e., public beliefs, facts, and decisions
2. $\bigcup_i E_{B_1 \cup \ldots \cup B_n}(F_1 \cup \ldots \cup F_n \cup \lambda_i)$, i.e., public beliefs and facts, private decisions
3. $\bigcup_i E_{B_1 \cup \ldots \cup B_n}(F_i \cup \lambda_1 \cup \ldots \cup \lambda_n)$, i.e., public beliefs and decisions, private facts
4. $\bigcup_i E_{B_1 \cup \ldots \cup B_n}(F_i \cup \lambda_i)$, i.e., public beliefs, private facts and decisions
5. $\bigcup_i E_{B_i}(F_1 \cup \lambda_1 \cup \ldots \cup F_n \cup \lambda_n)$, i.e., public facts and decisions, private beliefs
6. $\bigcup_i E_{B_i}(F_i \cup \lambda_1 \cup \ldots \cup \lambda_n)$, i.e., public decisions, private beliefs and facts
7. $\bigcup_i E_{B_i}(F_1 \cup \ldots \cup F_n \cup \lambda_i)$, i.e., public facts, private beliefs and decisions
8. $\bigcup_i E_{B_i}(F_i \cup \lambda_i)$, i.e., private beliefs, facts, and decisions







Given one of these definitions of $E_B(F \cup \lambda)$, the set of feasible decision profiles and a set of
logically non-equivalent feasible decision profiles are denoted by $\Lambda^f$ and $\bar{\Lambda}^f$, respectively.

Another way to explain these definitions of the feasibility of decision profiles is in 
terms of communication between agents. The agents communicate their beliefs, facts, 
or decisions and through this communication decision profiles become infeasible. The 
following example illustrates the feasibility of decision profiles according to the eight 
variations. 

Example 3. Let $A_1 = \{a\}$, $A_2 = \{b\}$, $W = \{p\}$ and $AS = \langle \{\alpha_1, \alpha_2\}, F, B, D, \geq, \lambda^0 \rangle$
with $F_1 = F_2 = \emptyset$, $B_1 = \{b \Rightarrow p\}$, $B_2 = \{a \Rightarrow \neg p\}$, $D_1 = D_2 = \emptyset$, $\geq$ is the
universal relation, and $\lambda_1^0 = \lambda_2^0 = \emptyset$. Note that the only belief of each agent is about
the consequence of the decisions that can be taken by the other agent. The following
four AS decision profiles are possible; the numbers associated to the following decision
profiles indicate according to which definitions of $E_B(F \cup \lambda)$ the decision profile $\lambda$ is
feasible: $\langle \emptyset, \emptyset \rangle$: 1...8;  $\langle \{a\}, \emptyset \rangle$: 1...8;  $\langle \emptyset, \{b\} \rangle$: 1...8;  $\langle \{a\}, \{b\} \rangle$: 7, 8.

Since the consequence of decisions that can be taken by each agent is captured by the
belief of the other agent, the decision profile $\langle \lambda_1, \lambda_2 \rangle = \langle \{a\}, \{b\} \rangle$ is only feasible when
the two agents do not communicate their beliefs and decisions, i.e.,

7. $E_B(F \cup \lambda) = E_{B_1}(F_1 \cup F_2 \cup \lambda_1) \cup E_{B_2}(F_1 \cup F_2 \cup \lambda_2) = \{a\} \cup \{b\} = \{a, b\}$.

8. $E_B(F \cup \lambda) = E_{B_1}(F_1 \cup \lambda_1) \cup E_{B_2}(F_2 \cup \lambda_2) = \{a\} \cup \{b\} = \{a, b\}$.

In all other cases, the decision profile $\langle \{a\}, \{b\} \rangle$ will be infeasible since $E_B(F \cup \lambda) = \{a, p, b, \neg p\}$ is inconsistent.
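
The eight formulations of Definition 7 differ only in which of the three aspects (beliefs, facts, decisions) are pooled over all agents. The sketch below replays Example 3 under that reading. It is an illustration only: sentences are restricted to literals ("~p" for the negation of p), rule applicability is tested by set containment, and the table `VARIANTS` and the function names are hypothetical.

```python
def extension(rules, start):
    """Close `start` under rules given as (antecedent literals, consequent)."""
    closed = set(start)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if set(antecedent) <= closed and consequent not in closed:
                closed.add(consequent)
                changed = True
    return closed


def is_consistent(literals):
    """Consistent means no atom occurs together with its negation "~atom"."""
    return not any("~" + lit in literals for lit in literals)


# Variant number -> (beliefs public?, facts public?, decisions public?),
# following the numbering of Definition 7.
VARIANTS = {
    1: (True, True, True), 2: (True, True, False),
    3: (True, False, True), 4: (True, False, False),
    5: (False, True, True), 6: (False, False, True),
    7: (False, True, False), 8: (False, False, False),
}


def profile_extension(beliefs, facts, profile, variant):
    """Union over agents of the extension each agent computes."""
    public_beliefs, public_facts, public_decisions = VARIANTS[variant]
    all_rules = [r for rules in beliefs for r in rules]
    all_facts = set().union(*facts)
    all_decisions = set().union(*profile)
    result = set()
    for i in range(len(profile)):
        rules = all_rules if public_beliefs else beliefs[i]
        base = all_facts if public_facts else set(facts[i])
        base = base | (all_decisions if public_decisions else set(profile[i]))
        result |= extension(rules, base)
    return result


def feasible_variants(beliefs, facts, profile):
    """Numbers of the variants under which the profile is feasible."""
    return [v for v in sorted(VARIANTS)
            if is_consistent(profile_extension(beliefs, facts, profile, v))]


# Example 3: B_1 = {b => p}, B_2 = {a => ~p}, no facts.
beliefs = [[(("b",), "p")], [(("a",), "~p")]]
facts = [set(), set()]
for profile in [(set(), set()), ({"a"}, set()), (set(), {"b"}), ({"a"}, {"b"})]:
    print(profile, feasible_variants(beliefs, facts, profile))
```

Under these assumptions the profile ({a}, {b}) is feasible only for variants 7 and 8, while the other three profiles are feasible for all eight.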

In general, agents may communicate their beliefs, facts, and decisions only to some, but
not all, agents. The set of agents to which an agent communicates its beliefs, facts, and
decisions depends on the communication network between agents. In order to define the
feasibility of decision profiles in terms of communication, one needs to restrict the union
operators in the various definitions of $E_B(F \cup \lambda)$ to subsets of agents that can communicate
with each other. In this paper, we do not study this further requirement.

The various definitions of $E_B(F \cup \lambda)$, as proposed in definition 7, are related to each
other with respect to the public-/privateness of beliefs, facts, and decisions.

Definition 8. A definition $\mathcal{D}$ of $E_B(F \cup \lambda)$, as given in definition 7, is more public than
another definition $\mathcal{D}'$ of $E_B(F \cup \lambda)$, written as $\mathcal{D}' \subseteq_p \mathcal{D}$, if and only if all aspects (i.e.,
beliefs, facts, and decisions) that are public in $\mathcal{D}'$ are also public in $\mathcal{D}$.

The definition results in a lattice structure on the eight definitions of $E_B(F \cup \lambda)$, as illustrated in Figure 1. The top of the lattice is the definition of $E_B(F \cup \lambda)$ according to
which beliefs, facts, and decisions of agents are communicated to each other, and the
bottom of the lattice is the definition of $E_B(F \cup \lambda)$ according to which beliefs, facts, and
decisions of agents are not communicated. This lattice shows that the more-public-than
relation is the same as the subset relation on the public aspects.

Proposition 2. Let $\mathcal{D}$ and $\mathcal{D}'$ be two definitions of $E_B(F \cup \lambda)$ such that $\mathcal{D}' \subseteq_p \mathcal{D}$. The
feasibility of decision profiles persists under the $\subseteq_p$ relation, i.e., for all decision profiles
$\lambda$, if the decision profile $\lambda$ is feasible w.r.t. the definition $\mathcal{D}$, then it is also feasible
w.r.t. the definition $\mathcal{D}'$.
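
The more-public-than relation of Definition 8 can be made concrete by recording, for each of the eight formulations of Definition 7, the set of aspects that it treats as public. The sketch below is only an illustration; the table `PUBLIC_ASPECTS` and the function names are hypothetical.

```python
# Public aspects of the eight formulations, numbered as in Definition 7.
PUBLIC_ASPECTS = {
    1: {"beliefs", "facts", "decisions"},
    2: {"beliefs", "facts"},
    3: {"beliefs", "decisions"},
    4: {"beliefs"},
    5: {"facts", "decisions"},
    6: {"decisions"},
    7: {"facts"},
    8: set(),
}


def more_public_than(d, d_prime):
    """True if every aspect that is public in d_prime is also public in d."""
    return PUBLIC_ASPECTS[d_prime] <= PUBLIC_ASPECTS[d]


def comparable_pairs():
    """All ordered pairs (d_prime, d) of distinct definitions with d more public."""
    return [(dp, d) for dp in PUBLIC_ASPECTS for d in PUBLIC_ASPECTS
            if dp != d and more_public_than(d, dp)]


print(more_public_than(1, 8))   # formulation 1 is the top, 8 the bottom
print(more_public_than(2, 3), more_public_than(3, 2))  # an incomparable pair
print(len(comparable_pairs()))
```

Read together with Proposition 2, a profile that is feasible under formulation `d` is feasible under every `d_prime` for which `more_public_than(d, d_prime)` holds.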







[Figure 1 (not reproduced): the lattice of the eight definitions of $E_B(F \cup \lambda)$ under the more-public-than relation.]




This proposition states that if a decision is feasible when aspects are public, the decision 
remains feasible when public aspects become private. The following proposition states 
that communication is relevant for the feasibility of decisions only if the agent system 
specification consists of more than one agent. 

Proposition 3. Let $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ be an agent system specification where
$|S| = 1$, i.e., there exists only one agent. If a decision $\lambda$ is feasible according to one
definition of $E_B(F \cup \lambda)$, then it is feasible according to all definitions of $E_B(F \cup \lambda)$.

In the sequel, we proceed with the definition of $E_B(F \cup \lambda)$ where agents’ facts, beliefs,
and decisions are assumed to be private, i.e., $E_B(F \cup \lambda) = \bigcup_i E_{B_i}(F_i \cup \lambda_i)$.

2.4 Agent Preferences 

In this section we introduce a way to compare decisions. Decisions are compared by sets 
of desire rules that are not reached by the decisions. A desire x => y is unreached by a 
decision if the expected consequences of the decision imply x but not y (see footnote 1).

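Definition 9 (Comparing decision profiles). Let $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$, $\lambda$ be an
AS decision profile, and $E_B(F \cup \lambda) = \bigcup_i E_{B_i}(F_i \cup \lambda_i)$. The unreached desires of $\lambda$
for agent $\alpha_i$ are: $U_i(\lambda) = \{x \Rightarrow y \in D_i \mid E_B(F \cup \lambda) \models x$ and $E_B(F \cup \lambda) \not\models y\}$.
Decision profile $\lambda$ is at least as good as decision profile $\lambda'$ for agent $\alpha_i$, written as
$\lambda \geq_i^U \lambda'$, iff $U_i(\lambda') \geq_i U_i(\lambda)$. Decision profile $\lambda$ is equivalent to decision profile $\lambda'$ for
agent $\alpha_i$, written as $\lambda \sim_i^U \lambda'$, iff $\lambda \geq_i^U \lambda'$ and $\lambda' \geq_i^U \lambda$. Decision profile $\lambda$ dominates
decision profile $\lambda'$ for agent $\alpha_i$, written as $\lambda >_i^U \lambda'$, iff $\lambda \geq_i^U \lambda'$ and $\lambda' \not\geq_i^U \lambda$.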

Thus, a decision profile $\lambda$ is preferred by agent $\alpha_i$ to decision profile $\lambda'$ if the set of
unreached desire rules by $\lambda$ is less preferred by $\alpha_i$ to the set of unreached desire rules
by $\lambda'$. Note that the set of unreached desire rules for an infeasible decision is empty,
because an inconsistent belief extension entails every sentence, so that all infeasible decision profiles are equally preferred for each
agent. The following continuation of Example 2 illustrates the comparison of decisions.



Footnote 1: The comparison can also be based on the set of violated or reached desires. The desire rule is
violated or reached if these consequences imply $x \wedge \neg y$ or $x \wedge y$, respectively.







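Example 4 (Continued). In Example 2, the agent system specification consists of only
one agent such that each decision profile consists only of decisions of one agent. Below
are some decision profiles and their corresponding sets of unreached desire rules:

$U_1(\langle \{a\} \rangle) = \emptyset$,  $U_1(\langle \{a, b\} \rangle) = \{b \Rightarrow p\}$,  $U_1(\langle \{a, c\} \rangle) = \emptyset$,

$U_1(\langle \{a, d\} \rangle) = \{d \Rightarrow \neg q\}$,  $U_1(\langle \{a, b, c\} \rangle) = \emptyset$,  $U_1(\langle \{a, b, d\} \rangle) = \{b \Rightarrow p,\ d \Rightarrow \neg q\}$,

$U_1(\langle \{a, c, d\} \rangle) = \emptyset$,  $U_1(\langle \{a, b, c, d\} \rangle) = \emptyset$.

We thus have for example that the decision profile $\langle \{a, c\} \rangle$ dominates the decision
profile $\langle \{a, b\} \rangle$, $\langle \{a, b\} \rangle$ dominates $\langle \{a, d\} \rangle$, and $\langle \{a, d\} \rangle$ dominates $\langle \{a, b, d\} \rangle$, i.e.,
$\langle \{a, c\} \rangle >_1^U \langle \{a, b\} \rangle >_1^U \langle \{a, d\} \rangle >_1^U \langle \{a, b, d\} \rangle$. Moreover, $\langle \{a\} \rangle$, $\langle \{a, c\} \rangle$, and $\langle \{a, b, c\} \rangle$
are equivalent, i.e., $\langle \{a\} \rangle \sim_1^U \langle \{a, c\} \rangle \sim_1^U \langle \{a, b, c\} \rangle$.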

In Example 2, the decision profiles $\langle \{a, c, d\} \rangle$ and $\langle \{a, b, c, d\} \rangle$ are infeasible and therefore,
since an inconsistent belief extension entails every sentence, have the empty set as the set of unreached desire rules. A problem
with defining the unreached based ordering on infeasible decisions is that the unreached
based ordering cannot differentiate between some feasible decisions, e.g., $\langle \{a, b, c\} \rangle$,
and infeasible decisions, e.g., $\langle \{a, c, d\} \rangle$.
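
The unreached desire sets of Example 4 can be recomputed with a small sketch. It rests on assumptions that are not part of the paper: sentences are literals ("~p" for the negation of p), entailment is set membership, and, following the values listed in Example 4, an inconsistent extension is treated as entailing every sentence. The names `entails`, `unreached`, `RANK` and `badness` are hypothetical.

```python
def extension(rules, start):
    """Close `start` under rules given as (antecedent literals, consequent)."""
    closed = set(start)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if set(antecedent) <= closed and consequent not in closed:
                closed.add(consequent)
                changed = True
    return closed


def is_consistent(literals):
    """Consistent means no atom occurs together with its negation "~atom"."""
    return not any("~" + lit in literals for lit in literals)


def entails(closed, literals):
    """An inconsistent set entails everything; otherwise use membership."""
    return not is_consistent(closed) or set(literals) <= closed


def unreached(desires, closed):
    """Names of desire rules x => y with x entailed and y not entailed."""
    return {name for name, (x, y) in desires.items()
            if entails(closed, x) and not entails(closed, [y])}


# Belief and desire rules of Example 2.
B_1 = [(("b",), "q"), (("c",), "p"), (("d",), "~p")]
D_1 = {"b=>p": (("b",), "p"), "d=>~q": (("d",), "~q")}

# Position of each set of desire rules in the ordering >=_1 of Example 2
# (a higher number means a more preferred set of desire rules).
RANK = {frozenset(): 0, frozenset({"b=>p"}): 1,
        frozenset({"d=>~q"}): 2, frozenset(D_1): 3}


def badness(decision):
    """Rank of the unreached set; by Definition 9, lower means a better decision."""
    return RANK[frozenset(unreached(D_1, extension(B_1, decision)))]


for decision in [{"a", "c"}, {"a", "b"}, {"a", "d"}, {"a", "b", "d"}]:
    print(sorted(decision), unreached(D_1, extension(B_1, decision)), badness(decision))
```

The increasing `badness` values along the loop correspond to the dominance chain stated in Example 4.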

The following proposition states that the agent’s ordering on decision profiles based 
on unreached desire rules is a preference ordering. 

Proposition 4. Let $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$ be an agent system specification. For
each agent, the ordering on decision profiles based on the set of unreached desire rules
is a preorder.



3 From Agent System Specifications to Game Specifications 

In this section, we consider interactions between agents based on agent system specifications, their corresponding agent decisions, and the ordering on the decisions as explained in previous subsections. Agents select optimal decisions under the assumption
that other agents do likewise. This makes the definition of an optimal decision circular,
and game theory therefore restricts its attention to equilibria. For example, a decision
is a Nash equilibrium if no agent can reach a better (local) decision by changing its
own decision. The most used concepts from game theory are Pareto efficient decisions,
dominant decisions and Nash decisions. We first repeat some standard notations from
game theory [7].

In the sequel, we consider only strategic games, which are described in [7] as follows:
A strategic game is a model of interactive decision making in which each decision maker
chooses his plan of action once and for all, and these choices are made simultaneously.
For strategic games, we use $\delta_i$ to denote a decision (strategy) of agent $\alpha_i$ and $\delta = \langle \delta_1, \ldots, \delta_n \rangle$
to denote a decision (strategy) profile containing one decision (strategy) for
each agent. We use $\delta_{-i}$ to denote the decision profile of all agents except the decision of
agent $\alpha_i$, and $(\delta_{-i}, \delta'_i)$ to denote a decision profile which is the same as $\delta$ except that the
decision of agent i from $\delta$ is replaced with the decision $\delta'_i$. Finally, we use $\Delta$ to denote
the set of all possible decision profiles for agents $\alpha_1, \ldots, \alpha_n$ and $\Delta_i$ to denote the set
of possible decisions for agent $\alpha_i$.

Definition 10 (Strategic Game Specification [7]). A strategic game specification is a
tuple $\langle N, \Delta, \geq^{gs} \rangle$ where $N = \{\alpha_1, \ldots, \alpha_n\}$ is a set of agents, $\Delta \subseteq \Delta_1 \times \ldots \times \Delta_n$ where
$\Delta_i$ is a set of decisions of agent $\alpha_i$, and $\geq^{gs} = \langle \geq_1^{gs}, \ldots, \geq_n^{gs} \rangle$, where $\geq_i^{gs}$ is the preference
order of agent $\alpha_i$ on $\Delta$. The preference order $\geq_i^{gs}$ is reflexive, transitive, and complete.

We now define a mapping AS2AG from the specification of an agent system to the 
specification of a strategic game. This mapping is based on the correspondence be- 
tween logically non-equivalent feasible decisions from an agent system specification 
and agents’ decisions in strategic game specifications, and between agents’ preference 
orderings in both specifications. We have chosen to make a correspondence based on 
the logically non-equivalent decisions (and not possible decisions) since otherwise the
resulting strategic game specification would contain logically equivalent decision profiles.

Definition 11 (AS2AG). Let $S = \{\alpha_1, \ldots, \alpha_n\}$, $AS = \langle S, F, B, D, \geq, \lambda^0 \rangle$, $\bar{\Lambda}^f$ be a
set of AS logically non-equivalent feasible decision profiles according to Definition 7
and for a given definition of $E_B(F \cup \lambda)$ (i.e., $\bigcup_i E_{B_i}(F_i \cup \lambda_i)$), and $\geq_i^U$ be the AS
preference order of agent $\alpha_i$ defined on $\bar{\Lambda}^f$ according to definition 9. The strategic game
specification of AS is $GS = \langle N, \Delta, \geq^{gs} \rangle$ if there exists a bijective function g from AS
to GS such that $g : S \to N$, $g : \bar{\Lambda}^f \to \Delta$, and $g : \geq^U \to \geq^{gs}$.



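Example 5. Let $\alpha_1$ and $\alpha_2$ be two agents, $F_1 = F_2 = \emptyset$, and initial decisions $\lambda_1^0 = \lambda_2^0 = \emptyset$. They have the following beliefs and desires: $B_1 = \{a \Rightarrow p\}$, $D_1 = \{\top \Rightarrow p,\ \top \Rightarrow q\}$,
$\emptyset >_1 \{\top \Rightarrow q\} >_1 \{\top \Rightarrow p\} >_1 \{\top \Rightarrow p,\ \top \Rightarrow q\}$, $B_2 = \{b \Rightarrow q\}$,
$D_2 = \{q \Rightarrow \neg p,\ p \Rightarrow \neg q\}$, $\emptyset >_2 \{q \Rightarrow \neg p\} >_2 \{p \Rightarrow \neg q\} >_2 \{p \Rightarrow \neg q,\ q \Rightarrow \neg p\}$. Let
$\lambda$ be a feasible decision profile according to any of the eight formulations of $E_B(F \cup \lambda)$
as proposed in definition 7, and $U_i(\lambda)$ be the set of unreached desires for agent $\alpha_i$ and
decision profile $\lambda$.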



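$\lambda$                                   $E_B(F \cup \lambda)$    $U_1(\lambda)$                               $U_2(\lambda)$

$\langle \emptyset, \emptyset \rangle$      $\emptyset$              $\{\top \Rightarrow p,\ \top \Rightarrow q\}$   $\emptyset$

$\langle \emptyset, \{b\} \rangle$          $\{q\}$                  $\{\top \Rightarrow p\}$                      $\{q \Rightarrow \neg p\}$

$\langle \{a\}, \emptyset \rangle$          $\{p\}$                  $\{\top \Rightarrow q\}$                      $\{p \Rightarrow \neg q\}$

$\langle \{a\}, \{b\} \rangle$              $\{p, q\}$               $\emptyset$                                   $\{p \Rightarrow \neg q,\ q \Rightarrow \neg p\}$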

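According to definition 9 the preference ordering over possible decision profiles for agent
$\alpha_1$ is $\langle \{a\}, \{b\} \rangle >_1^U \langle \{a\}, \emptyset \rangle >_1^U \langle \emptyset, \{b\} \rangle >_1^U \langle \emptyset, \emptyset \rangle$, and for agent $\alpha_2$ is $\langle \emptyset, \emptyset \rangle >_2^U
\langle \emptyset, \{b\} \rangle >_2^U \langle \{a\}, \emptyset \rangle >_2^U \langle \{a\}, \{b\} \rangle$. Consider now the following mapping from AS to
GS based on the bijective function g defined as $g(\alpha_i) = \alpha_i$ for all $\alpha_i \in S$, $g(\lambda) = \lambda$ for all $\lambda \in \bar{\Lambda}^f$,
and $g(\geq_i^U) = \geq_i^U$ for i = 1, 2. The resulting strategic game specification of AS
is $GS = \langle N, \Delta, \geq^{gs} \rangle$, where $N = \{g(\alpha_i) \mid \alpha_i \in S\}$, $\Delta = \{g(\lambda) \mid \lambda \in \bar{\Lambda}^f\}$, and
$\geq^{gs} = \langle g(\geq_1^U), g(\geq_2^U) \rangle$.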
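
The table of Example 5 can be recomputed with the sketch below, which uses private beliefs, facts and decisions (formulation 8 of Definition 7). It assumes a literal-only encoding ("~p" for the negation of p), an empty antecedent for the tautology, and entailment by set membership, which is sufficient here because all four extensions are consistent. Unlike the table, the computed extensions also list the decision atoms a and b. All names are hypothetical.

```python
def extension(rules, start):
    """Close `start` under rules given as (antecedent literals, consequent)."""
    closed = set(start)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if set(antecedent) <= closed and consequent not in closed:
                closed.add(consequent)
                changed = True
    return closed


# Belief rules B_1, B_2 and desire rules D_1, D_2 of Example 5.
B = [[(("a",), "p")], [(("b",), "q")]]
D = [
    {"T=>p": ((), "p"), "T=>q": ((), "q")},
    {"q=>~p": (("q",), "~p"), "p=>~q": (("p",), "~q")},
]


def joint_extension(profile):
    """Union of the private extensions of all agents (no facts in Example 5)."""
    result = set()
    for rules, decision in zip(B, profile):
        result |= extension(rules, decision)
    return result


def unreached(desires, closed):
    """Names of desire rules whose antecedent holds but whose consequent does not."""
    return {name for name, (x, y) in desires.items()
            if set(x) <= closed and y not in closed}


def table():
    """One row per decision profile: (profile, extension, U_1, U_2)."""
    rows = []
    for profile in [((), ()), ((), ("b",)), (("a",), ()), (("a",), ("b",))]:
        closed = joint_extension(profile)
        rows.append((profile, closed, unreached(D[0], closed), unreached(D[1], closed)))
    return rows


for row in table():
    print(row)
```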



We now use the mapping from AS to GS and consider different types of decision profiles 
which are similar to types of decision (strategy) profiles from game theory. 

Definition 12. [7] Let $\bar{\Lambda}^f$ be a set of logically non-equivalent feasible decision profiles
that is derived from the AS specification of an agent system and GS be the strategic game
specification of AS based on the mapping g. A feasible AS decision profile $\lambda \in \bar{\Lambda}^f$ is a
Pareto decision if $g(\lambda) = \delta$ is a Pareto decision in GS, i.e., if there is no $\delta' \in \Delta$ for which
$\delta'_i >_i^{gs} \delta_i$ for all agents $\alpha_i \in N$. A feasible AS decision profile $\lambda$ is a strongly Pareto decision if $g(\lambda) = \delta$
is a strongly Pareto decision in GS, i.e., if there is no $\delta' \in \Delta$ for which $\delta'_i \geq_i^{gs} \delta_i$ for all
agents $\alpha_i$ and $\delta'_j >_j^{gs} \delta_j$ for some agents $\alpha_j$. A feasible AS decision profile $\lambda$ is a weak dominant decision
if $g(\lambda) = \delta$ is a weak dominant decision in GS, i.e., if for all $\delta' \in \Delta$ and for every agent
$\alpha_i$ it holds: $(\delta'_{-i}, \delta_i) \geq_i^{gs} (\delta'_{-i}, \delta'_i)$. A feasible AS decision profile $\lambda$ is a strong dominant decision if for
all $\delta' \in \Delta$ and for every agent $\alpha_i$ it holds: $(\delta'_{-i}, \delta_i) >_i^{gs} (\delta'_{-i}, \delta'_i)$. Finally, a feasible
AS decision profile $\lambda$ is a Nash decision if $g(\lambda) = \delta$ is a Nash decision in GS, i.e., if for all agents $\alpha_i$ it
holds: $(\delta_{-i}, \delta_i) \geq_i^{gs} (\delta_{-i}, \delta'_i)$ for all $\delta'_i \in \Delta_i$.

It is a well known fact that Pareto decisions exist (for finite games), whereas dominant 
decisions do not have to exist. Consider the strategic game specification GS which is 
derived from the agent system specification AS in Example 5. None of the decision 
profiles in GS are dominant decisions. 
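
For a finite strategic game, the decision types of Definition 12 can be checked by brute force. The sketch below does so for a small hypothetical two-agent game that is not taken from the paper (it is not the game of Example 5). Preferences are encoded as ordinal ranks, a higher rank meaning a more preferred profile, and the set of profiles is assumed to be the full product of the agents' decision sets. All names are hypothetical.

```python
from itertools import product

# Hypothetical decision sets Delta_1 and Delta_2.
ACTIONS = [["C", "D"], ["C", "D"]]
PROFILES = list(product(*ACTIONS))

# Hypothetical ordinal ranks per agent (higher is better).
RANK = [
    {("D", "C"): 3, ("C", "C"): 2, ("D", "D"): 1, ("C", "D"): 0},
    {("C", "D"): 3, ("C", "C"): 2, ("D", "D"): 1, ("D", "C"): 0},
]
AGENTS = range(len(RANK))


def swap(profile, i, own):
    """Return the profile (profile_{-i}, own)."""
    return profile[:i] + (own,) + profile[i + 1:]


def is_pareto(profile):
    """No other profile is strictly better for all agents."""
    return not any(all(RANK[i][other] > RANK[i][profile] for i in AGENTS)
                   for other in PROFILES)


def is_strongly_pareto(profile):
    """No other profile is at least as good for all and better for some."""
    return not any(all(RANK[i][other] >= RANK[i][profile] for i in AGENTS)
                   and any(RANK[i][other] > RANK[i][profile] for i in AGENTS)
                   for other in PROFILES)


def is_weak_dominant(profile):
    """Each agent's decision is at least as good whatever the others do."""
    return all(RANK[i][swap(other, i, profile[i])] >= RANK[i][other]
               for other in PROFILES for i in AGENTS)


def is_nash(profile):
    """No agent gains by changing only its own decision."""
    return all(RANK[i][profile] >= RANK[i][swap(profile, i, own)]
               for i in AGENTS for own in ACTIONS[i])


for name, check in [("Pareto", is_pareto), ("strongly Pareto", is_strongly_pareto),
                    ("weak dominant", is_weak_dominant), ("Nash", is_nash)]:
    print(name, [p for p in PROFILES if check(p)])
```

The number of profiles grows as the product of the sizes of the decision sets, which is one reason why such brute-force checks become costly for larger specifications.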

Starting from an agent system specification, we can derive the strategic game specification, and in this game specification we can use standard techniques to find, for example,
the Pareto decisions. The problem with this approach is that the translation from an agent
system specification to a strategic game specification is computationally expensive. For 
example, a compact agent representation with only a few belief and desire rules may 
lead to a huge set of decisions if the number of decision variables is high. A challenge of 
qualitative game theory is therefore whether we can bypass the translation to strategic 
game specification, and define properties directly on the agent system specification. For 
example, are there particular properties of agent system specification for which we can 
prove that there always exists a dominant decision for its corresponding derived game 
specification? An example is the case in which the agents have the same individual agent 
specification, because in that case the game reduces to single agent decision making. 
In this paper we do not pursue this challenge, but we consider the expressive power of 
agent specifications. 

4 From Game Specifications to Agent Specifications 

In this section the question is raised whether the notion of agent system specification is 
expressive enough for strategic game specifications, that is, whether for each possible 
strategic game specification there is an agent system specification that can be mapped on 
it. We prove this property in the following way. First, we define a mapping from strategic 
game specifications to agent system specifications. Second, we show that the composite 
mapping from strategic game specifications to agent system specifications and back to 
strategic game specifications is the identity relation. The second step shows that if a 
strategic game specification GS is mapped in step 1 on agent system specification AS,
then this agent system specification AS can be mapped on GS. Thus, it shows that there
is a mapping from agent system specifications to strategic game specifications for every 
strategic game specification GS. 

Unlike the composite mapping of the second step, the composite mapping from agent system specifications to
strategic game specifications and back to agent system specifications is not the identity
relation. This is a direct consequence of the fact that there are distinct agent system
specifications that are mapped on the same strategic game specification, for example
agent system specifications in which the variable names are uniformly substituted by new
names. The mapping from strategic game specifications to agent system specifications
consists of the following steps. 1) The set of agents from the strategic game specification
is the set of agents for the agent system specification. 2) For each agent in the agent
system specification a set of decision variables is introduced that will generate the set of
decisions of the agent in the strategic game specification. 3) For each agent in the agent
system specification a set of desire rules and a preference ordering on the powerset of
the set of desire rules are introduced, such that they generate the preference order on
decision profiles for the agent in the strategic game specification.

According to definition 10, the ingredients of the specification of strategic games
are agents’ identifiers, agents’ decisions, and the preference ordering of each agent on
decision profiles. For each decision $\delta_i \in \Delta_i$ of each agent $\alpha_i$ in a strategic game
specification GS we introduce a separate decision variable $d_i$ for the corresponding
agent $\alpha_i$ in the agent system specification AS. The set of introduced decision variables
for agent $\alpha_i$ is denoted by $A_i$. The propositional language $L_{A_i}$, which is based on $A_i$,
specifies the set of possible AS decisions for agent $\alpha_i$. However, $L_{A_i}$ specifies more AS
decisions than GS decisions since it can combine decision variables with conjunction
and negation. In order to avoid additional decisions for each agent, we design the initial
decisions for each agent such that possible decisions are characterized by one and only
one decision variable. In particular, the initial decision $\lambda_i^0$ of agent $\alpha_i$ is specified as
follows: $\lambda_i^0 = \{\bigvee_k d_k,\ d \to \neg d' \mid d_k \in A_i$ & $d \neq d'$ & $d, d' \in A_i\}$. Moreover, the set
of decision profiles of strategic game specifications is a subset of all possible decision
profiles, i.e., $\Delta \subseteq \Delta_1 \times \ldots \times \Delta_n$. This implies that we need to exclude AS decision
profiles that correspond with the excluded decision profiles in GS. For this reason, we
introduce belief rules for relevant agents to make excluded decision profiles infeasible in
AS. For example, suppose in a strategic game specification with two agents $\alpha_1$ and $\alpha_2$
the decision profile $\langle \delta_1, \delta_2 \rangle$ is excluded, i.e., $\langle \delta_1, \delta_2 \rangle \notin \Delta$, and that $d_i$ is the introduced decision
variable for $\delta_i$ for $i \in \{1, 2\}$. Then, we introduce a new parameter $p \in W$ and add
one belief formula for each agent as follows: $d_1 \Rightarrow p \in B_1$ & $d_2 \Rightarrow \neg p \in B_2$. Note
that in this example the decision profile $\langle d_1, d_2 \rangle$ in AS is not a feasible decision profile
anymore.
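
That the initial decision described above singles out exactly one decision variable can be checked by enumerating truth assignments. The sketch below is an illustration with hypothetical variable names d1, d2, d3; it lists the models of the disjunction of all decision variables together with d → ¬d' for every pair of distinct variables.

```python
from itertools import product


def models_of_initial_decision(variables):
    """Sets of true variables in each model of the initial decision.

    The initial decision contains the disjunction of all variables and,
    for every pair of distinct variables d and e, the implication d -> not e.
    """
    models = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        at_least_one = any(assignment.values())
        at_most_one = all(not (assignment[d] and assignment[e])
                          for d in variables for e in variables if d != e)
        if at_least_one and at_most_one:
            models.append({d for d in variables if assignment[d]})
    return models


print(models_of_initial_decision(["d1", "d2", "d3"]))
```

Each model makes exactly one decision variable true, so every consistent decision that contains the initial decision corresponds to one decision of the strategic game.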

We use the preference ordering $\geq_i^{gs}$ of each agent $\alpha_i$, defined on the set of decision
profiles $\Delta$ of the strategic game specification, and introduce a set of desire rules $D_i$
together with a preference ordering $\geq_i$ on the powerset of $D_i$ for agent $\alpha_i$. The set
of desire rules for an agent and its corresponding preference ordering are designed in
such a way that if the set of unreached desire rules for a decision in the agent system
specification is more preferred than the set of unreached desire rules for a second
decision, then the first decision is less preferred than the second one (see definition 9).

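Definition 13. Let $GS = \langle N, \Delta, \geq^{gs} \rangle$ be a strategic game specification where $N = \{\alpha_1, \ldots, \alpha_n\}$ is the set of agents, $\Delta \subseteq \Delta_1 \times \ldots \times \Delta_n$ is the set of decision profiles, $\geq^{gs} = \langle \geq_1^{gs}, \ldots, \geq_n^{gs} \rangle$ consists of the preference orderings of agents on $\Delta$, and $\bigvee \neg d_{-i} = \neg d_1 \vee \ldots \vee \neg d_{i-1} \vee \neg d_{i+1} \vee \ldots \vee \neg d_n$. Then, $AS = \langle S, \emptyset, B, D, \geq, \lambda^0 \rangle$ is the agent system
specification derived from GS, where $B = \langle B_1, \ldots, B_n \rangle$, $D = \langle D_1, \ldots, D_n \rangle$, $\lambda^0 = \langle \lambda_1^0, \ldots, \lambda_n^0 \rangle$, $\geq = \langle \geq_1, \ldots, \geq_n \rangle$, $S = N$, $A_i = \{d \mid d$ is a decision variable for $\delta \in \Delta_i\}$, $W = \{p_1, \ldots, p_n\}$ with a parameter for each infeasible decision profile,
$\lambda_i^0 = \{\bigvee_k d_k,\ d \to \neg d' \mid d_k \in A_i$ & $d \neq d'$ & $d, d' \in A_i\}$, $(d_i \Rightarrow p) \in B_i$ & $(d_j \Rightarrow \neg p) \in B_j$ for all $\langle \delta_1, \ldots, \delta_n \rangle \notin \Delta$ (where $d_i \in A_i$, $d_j \in A_j$, $1 \leq i \neq j \leq n$, $p \in W$, and $d_i$, $d_j$
are the decision variables for $\delta_i$ and $\delta_j$, respectively), and $D_i = \{d_i \Rightarrow \bigvee \neg d_{-i} \mid \forall \delta = \langle \delta_1, \ldots, \delta_n \rangle \in \Delta\}$ (where $d_i$, $d_{-i}$ are the decision variables for $\delta_i$ and $\delta_{-i}$, respectively).
The preference relation $\geq_i$ defined on $Pow(D_i)$ is characterized as follows: 1) $s \geq_i \emptyset$ for
all $s \in Pow(D_i)$, 2) $\{d_i \Rightarrow \bigvee \neg d_{-i}\} \geq_i \{d'_i \Rightarrow \bigvee \neg d'_{-i}\}$ iff $\delta' \geq_i^{gs} \delta$ (where $(d_{-i}, d_i)$
and $(d'_{-i}, d'_i)$ are the decision profiles for $\delta$ and $\delta'$, respectively), and 3) $s' \geq_i s$ for all
$s, s' \in Pow(D_i)$ & $|s| \leq 1$ & $|s'| > 1$.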

In this definition, the set $D_i$ is designed in the context of ordering on decisions based
on unreached desire rules, as it is defined in Definition 9. In particular, for agent $\alpha_i$ we
define for each decision variable a desire rule that will be unreached as the only desire rule
by exactly one decision profile. This is the construction of $D_i$ in this definition. Then, we
use the reverse of the preference ordering from the strategic game specification, which
was defined on decision profiles, to order the unreached (singleton) sets of desire rules.
Since each decision profile has exactly one unreached desire rule, the preference ordering
$\geq_i^{gs}$ on decision profiles can be directly used as a preference ordering on unreached sets
of desire rules, which in turn is the reverse of the preference ordering $\geq_i$ defined on
the powerset of the set $D_i$ of desire rules. This is realized by the three items of this
definition. The first item indicates that any desire rule is more preferred than no desire
rule $\emptyset$. The second item indicates that if a decision profile $\delta'$ is preferred over a decision
profile $\delta$ according to the unreached desire rules, then the desire rules that are unreached
by $\delta$ are preferred over the desire rules that are unreached by $\delta'$. Finally, the last item
guarantees that the characterized preference ordering $\geq_i$ is complete by indicating that
all other sets of desire rules are preferred over the singletons of desire rules (sets
consisting of only one desire rule) and the empty set of desire rules.

The following proposition shows that the mapping from strategic game specifications 
to agent system specifications leads to the desired identity relation for the composite 
relation. 

Proposition 5. Let GS be a strategic game specification as in Definition 10. Moreover, let AS be the derived agent system specification as defined in Definition 13. The application of the mapping from agent system specification to strategic game specification, as defined in Definition 11, maps AS to GS.

Proof. Assume any particular GS. Construct the AS as above. Now apply Definition 9. The unreached desires of decision $\delta$ for agent $\alpha_i$ are $U_i(\delta) = \{x \Rightarrow y \in D \mid E_B(F \cup \delta) \models x \text{ and } E_B(F \cup \delta) \not\models y\}$. The subset ordering on these sets of unreached desires reflects exactly the original ordering on decisions.

The following theorem follows directly from the proposition. 

Theorem 1. The set of agent system specifications with empty set of facts is expressive 
enough for strategic game specifications. 

Proof. Follows from construction in Proposition 5. 

The above construction raises the question whether other sets of agent system specifi- 
cations are complete too, such as for example the set of agent system specifications in 
which the set of desires contains only desire-to-be desires. We leave these questions for 
further research. 







5 Concluding Remarks 

In this paper we introduce agent system specifications based on belief and desire rules, we show how various kinds of strategic games can be played (depending on whether the beliefs, desires and decisions are public or private), and we show how for each possible strategic game an agent specification can be defined that plays that game. The agent system specification we propose is relatively simple, but the extension of the results to more complex agent system specifications seems straightforward. We believe that such results give new insights into the alternative theories which are now being developed in artificial intelligence, agent theory and cognitive science.

Our work is typical of a line of research known as qualitative decision theory, which aims at closing the gap between classical decision and game theory on the one hand, and alternative theories developed in artificial intelligence, agent theory and cognitive science on the other. In our opinion, our main contribution is not the particular technical results of this paper, but their illustration of how the classical and alternative theories can use each other's results. Our motivation comes from the analysis of rule-based agent architectures, which have recently been introduced.

There are several topics for further research. The most interesting question is whether 
belief and desire rules are fundamental, or whether they in turn can be represented by 
some other construct. Other topics for further research are the development of an incremental any-time algorithm to find optimal decisions, the development of computationally attractive fragments of the logic, and heuristics for the optimization problem.



References 

1. C. Boutilier. Toward a logic for qualitative decision theory. In Proceedings of the KR'94, pages 75-86, 1994.

2. J. Broersen, M. Dastani, J. Hulstijn, and L. van der Torre. Goal generation in the BOID architecture. Cognitive Science Quarterly. Special issue on 'Desires, goals, intentions, and values: Computational architectures', 2(3-4):428-447, 2002.

3. P.R. Cohen and H.J. Levesque. Intention is choice with commitment. Artificial Intelligence, 42:213-261, 1990.

4. M. Dastani and L. van der Torre. Decisions and games for BD agents. In Proceedings of The Workshop on Game Theoretic and Decision Theoretic Agents, Canada, pages 37-43, 2002.

5. M. Dastani and L. van der Torre. What is a normative goal? Towards Goal-based Normative Agent Architectures. In Regulated Agent-Based Systems, Postproceedings of RASTA '02, pages 210-227. Springer, 2004.

6. J. Lang. Conditional desires and utilities - an alternative approach to qualitative decision theory. In Proceedings of the European Conference on Artificial Intelligence (ECAI'96), pages 318-322, 1996.

7. Martin J. Osborne and Ariel Rubinstein. A Course in Game Theory. The MIT Press, Cambridge, Massachusetts, 1994.

8. A. Rao and M. Georgeff. Modeling rational agents within a BDI architecture. In Proceedings of the KR'91, pages 473-484, 1991.

9. L. Savage. The foundations of statistics. Wiley, New York, 1954.

10. H. A. Simon. The Sciences of the Artificial. MIT Press, Cambridge, MA, 1981.

11. J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Princeton, NJ, 1st edition, 1944.




Knowledge-Theoretic Properties of Strategic Voting

Samir Chopra 1, Eric Pacuit 2, and Rohit Parikh 3

1 Department of Computer Science
Brooklyn College of CUNY
Brooklyn, New York 11210
schopra@sci.brooklyn.cuny.edu

2 Department of Computer Science
CUNY Graduate Center
New York, NY 10016
epacuit@cs.gc.cuny.edu

3 Departments of Computer Science, Mathematics and Philosophy
Brooklyn College and CUNY Graduate Center
New York, NY 10016
ripbc@cunyvm.cuny.edu



Abstract. Results in social choice theory such as the Arrow and 
Gibbard-Satterthwaite theorems constrain the existence of rational col- 
lective decision making procedures in groups of agents. The Gibbard- 
Satterthwaite theorem says that no voting procedure is strategy-proof. 
That is, there will always be situations in which it is in a voter’s interest 
to misrepresent its true preferences i.e., vote strategically. We present 
some properties of strategic voting and then examine - via a bimodal 
logic utilizing epistemic and strategizing modalities - the knowledge- 
theoretic properties of voting situations and note that unless the voter 
knows that it should vote strategically, and how, i.e., knows what the 
other voters’ preferences are and which alternate preference P' it should 
use, the voter will not strategize. Our results suggest that opinion polls 
in election situations effectively serve as the first n - 1 stages in an n-stage election.



1 Introduction 

A comprehensive theory of multi-agent interactions must pay attention to results 
in social choice theory such as the Arrow and Gibbard-Satterthwaite theorems [1, 
7,17]. These impossibility results constrain the existence of rational collective de- 
cision making procedures. Work on formalisms for belief merging already reflects 
the attention paid to social choice theory [9,6,12,11,13]. In this study we turn 
our attention to another aspect of social aggregation scenarios: the role played 
by the states of knowledge of the agents. The study of strategic interactions in 
game theory reflects the importance of states of knowledge of the players. In this 
paper, we bring these three issues — states of knowledge, strategic interaction and 
social aggregation operations — together. 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 18—30, 2004. 
(c) Springer- Verlag Berlin Heidelberg 2004 







The Gibbard-Satterthwaite theorem is best explained as follows (see footnote 1). Let $S$ be a social choice function whose domain is an n-tuple of preferences $P_1, \ldots, P_n$, where {1, ..., n} are the voters, M is the set of choices or candidates and each $P_i$ is a linear order over M. S takes $P_1, \ldots, P_n$ as input and produces some element of M, the winner. Then the theorem says that there must be situations where it 'profits' a voter to vote strategically. Specifically, if P denotes the actual preference ordering of voter i and Y denotes the profile consisting of the preference orderings of all the other voters, then the theorem says that there must exist P, Y, P' such that $S(P', Y) >_P S(P, Y)$. Here $>_P$ indicates: better according to P. Thus in the situation where the voter's actual ordering is P and all the orderings of the other voters (together) are Y, voter i is better off saying its ordering is P' rather than what it actually is, namely P. In particular, if the vote consists of voting for the highest element of the preference ordering, it should vote for the highest element of P' rather than of P.
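The statement above can be illustrated with a small brute-force search. The following is only a sketch under assumptions that are not in the text: plurality voting, alphabetical tie-breaking, three voters, and three hypothetical candidates named a, b and c. It looks for a triple (P, Y, P') in which voter 0, with true ranking P, obtains a strictly better winner by reporting P'.

```python
from itertools import permutations, product

CANDIDATES = ("a", "b", "c")  # hypothetical candidate names (assumption)


def plurality_winner(profile):
    """Each ballot is a ranking (best first). Count top choices and break
    ties alphabetically (an assumed tie-breaking rule, not from the text)."""
    tally = {c: 0 for c in CANDIDATES}
    for ranking in profile:
        tally[ranking[0]] += 1
    # Highest count wins; among equals, the alphabetically first name.
    return min(CANDIDATES, key=lambda c: (-tally[c], c))


def find_manipulation(n_voters=3):
    """Search for (P, Y, P') such that voter 0, whose true ranking is P,
    strictly prefers the winner under (P', Y) to the winner under (P, Y)."""
    rankings = list(permutations(CANDIDATES))
    for true_pref in rankings:
        for others in product(rankings, repeat=n_voters - 1):
            honest = plurality_winner((true_pref,) + others)
            for lie in rankings:
                outcome = plurality_winner((lie,) + others)
                # A smaller index in true_pref means "better according to P".
                if true_pref.index(outcome) < true_pref.index(honest):
                    return true_pref, others, lie, honest, outcome
    return None
```

Calling `find_manipulation()` returns one such situation if it exists for this particular rule; the search says nothing about other voting procedures, for which the theorem itself is the relevant guarantee.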

Of course, the agent might be forced to express a different preference. For 
example, if an agent, whose preferences are B > C > A, is only presented C, A 
as choices, then the agent will pick C. This ‘vote’ differs from the agent’s true 
preference, but should not be understood as ‘strategizing’ in the true sense. 

A real-life example of strategizing was noticed in the 2000 US elections when some supporters of Ralph Nader voted for their second preference, Gore (see footnote 2), in a vain attempt to prevent the election of George W. Bush. In that case, Nader voters decided that (voting for the maximal element of) a Gore-Nader-Bush expression of their preferences would be closer to their desired ordering of Nader-Gore-Bush than the Bush-Gore-Nader ordering that would result if they voted for their actual top choice. Similar examples of strategizing have occurred in other electoral systems over the years ([4] may be consulted for further details on the application of game-theoretic concepts to voting scenarios). The Gibbard-Satterthwaite theorem points out that situations like the one pointed out above must arise.

What interests us in this paper are the knowledge-theoretic properties of the 
situation described above. We note that unless the voter with preference P knows 
that it should vote strategically, and how, i.e., knows that the other voters’ 
preference is Y and that it should vote according to $P' \neq P$, the theorem is not
‘effective’. That is, the theorem only applies in those situations where a certain 
level of knowledge exists amongst voters. Voters completely or partially ignorant 
about other voters’ preferences would have little incentive to change their actual 
preference at election time. In the 2000 US elections, many Nader voters changed 
their votes because opinion polls had made it clear that Nader stood no chance 
of winning, and that Gore would lose as a result of their votes going to Nader. 

1 Later we use a different formal framework; we have chosen to use this more trans- 
parent formalism during the introduction for ease of exposition. 

2 Surveys show that had Nader not run, 46% of those who voted for him would have 
voted for Gore, 23% for Bush and 31% would have abstained. Hereafter, when we 
refer to Nader voters we shall mean those Nader voters who did or would have voted 
for Gore. 







We develop a logic for reasoning about the knowledge that agents have of 
their own preferences and other agents’ preferences, in a setting where a social 
aggregation function is defined and kept fixed throughout. We attempt to for- 
malize the intuition that agents, knowing an aggregation function, and hence 
its outputs for input preferences, will strategize if they know a) enough about 
other agents’ preferences and b) that the output of the aggregation function of 
a changed preference will provide them with a more favorable result, one that is 
closer to their true preference. We will augment the standard epistemic modality 
with a modality for strategizing. This choice of a bimodal logic brings with it 
a greater transparency in understanding the states that a voter will find itself 
in when there are two possible variances in an election: the preferences of the 
voters and the states of knowledge that describe these changing preferences. 

Our results will suggest that election-year opinion polls are a way to effec- 
tively turn a one-shot game, i.e., an election, into a many-round game that may 
induce agents to strategize. Opinion polls make voters’ preferences public in an 
election year and help voters decide on their strategies on the day of the election. 
For the rest of the paper, we will refer to opinion polls also as elections. 

The outline of the paper is as follows. In Section 2 we define a formal voting 
system and prove some preliminary results about strategic voting. In Section 3 we 
demonstrate the dependency of strategizing on the voters’ states of knowledge. 
In Section 4 we develop a bimodal logic for reasoning about strategizing in voting 
scenarios. 

2 A Formal Voting Model 

There is a wealth of literature on formal voting theory. This section draws upon 
discussions in [4,5]. The reader is urged to consult these for further details. 

Let $O = \{o_1, \ldots, o_m\}$ be a set of candidates and $A = \{1, \ldots, n\}$ be a set of agents or voters. We assume that each voter has a preference over the elements of O, i.e., a reflexive, transitive and connected relation on O. For simplicity we assume that each voter's preference is strict. A voter i's strict preference relation on O will be denoted by $P_i$. We represent each $P_i$ by a function $P_i : O \to \{1, \ldots, m\}$, where we say that a voter strictly prefers $o_j$ to $o_k$ iff $P_i(o_j) > P_i(o_k)$. We will write $P_i = (o_1, \ldots, o_m)$ iff $P_i(o_1) > P_i(o_2) > \cdots > P_i(o_m)$. Henceforth, for ease of readability we will use Pref to denote preferences over O. A preference profile is an element of $\mathit{Pref}^n$. Given each agent's preference, an aggregation function returns the social preference ordering over O.

Definition 1 (Aggregation Function). An aggregation function is a function from preference profiles to preferences:

$Ag : \mathit{Pref}^n \to \mathit{Pref}$

In voting scenarios such as elections, agents are not expected to announce 
their actual preference relation, but rather to select a vote that ‘represents’ their 
preference. Each voter chooses a vote v, the aggregation function tallies the 







votes of each candidate and selects a winner (or winners if electing more than 
one candidate). There are two components to any voting procedure. First, the 
type of votes that voters can cast. For example, in plurality voting voters can only 
vote for a single candidate so votes v are simply singleton subsets of O, whereas 
in approval voting voters select a set of candidates so votes v are any subset of O. 
Following [5], given a set O of candidates, let B(O) be the set of feasible votes, or ballots. The second component of any voting procedure is the way in which the votes are tallied to produce a winner (or winners if electing more than one candidate). We assume that the voting aggregation function will select exactly one winner, so ties are always broken (see footnote 3). Note that elements of the set $B(O)^n$ represent votes cast by the agents. An element $v \in B(O)^n$ is called a vote profile. A tallying function $Ag_v : B(O)^n \to O$ maps vote profiles to candidates.

Given agent i's preference $P_i$, let $S(v, P_i)$ mean that the vote v is a sincere vote corresponding to $P_i$. For example, in plurality voting, the only sincere vote is a vote for the maximally ranked candidate under $P_i$. By contrast, in approval voting, there could be many sincere votes, i.e., those votes where, if candidate o is approved, so is any higher ranking o'. Then $B(O)_i = \{v \mid S(v, P_i)\}$ is the set of votes which faithfully represent i's preference. The voter i is said to strategize if i selects a vote v that is not in the set $B(O)_i$.
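As an illustration of the two notions of sincerity just described, here is a minimal sketch. The function names are hypothetical, rankings are tuples listed best first, and ballots are modelled as frozensets of candidates; treating the empty approval ballot as (vacuously) sincere is an assumption of this sketch.

```python
def sincere_plurality_votes(ranking):
    """Plurality: the only sincere ballot names the top-ranked candidate."""
    return [frozenset({ranking[0]})]


def sincere_approval_votes(ranking):
    """Approval: a ballot is sincere if approving o implies approving every
    candidate ranked above o, i.e. the ballot is a prefix of the ranking.
    The empty ballot (k = 0) satisfies this vacuously (assumption)."""
    return [frozenset(ranking[:k]) for k in range(len(ranking) + 1)]


def is_strategic(vote, sincere_votes):
    """A voter strategizes when the cast vote is not one of the sincere ones."""
    return frozenset(vote) not in sincere_votes
```

For a ranking (o1, o2, o3), approving {o1, o2} counts as sincere in this sketch, whereas approving only {o2} does not, because o1 is ranked higher and is left out.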

In what follows we assume that when an agent votes, the agent is selecting a 
preference in the set Pref instead of an element of B(O). A vote is a preference; 
a vote profile is a vector of preferences, denoted by P (see footnote 4).

Assume that the agents' true preferences are $P^* = (P^*_1, \ldots, P^*_n)$ and fixed for the remaining discussion. Given a profile P of actual votes, we ask whether agent i will change its vote if given another chance to express its preference. Let $P_{-i}$ be the vector of all other agents' preferences. Then given $P_{-i}$ and i's true preference $P^*_i$, there will be a (nonempty) set $X_i$ of those preferences that are i's best response to $P_{-i}$. Suppose that $f_i(P_{-i})$ selects one such best response from $X_i$ (see footnote 5). Then $f(P) = (f_1(P_{-1}), \ldots, f_n(P_{-n}))$. We call f a strategizing function. If P is a fixed point of f (i.e., f(P) = P), then P is a stable outcome. In other words, such a fixed point P of f is a Nash equilibrium. We define $f^n$ recursively by $f^1(P) = f(P)$, $f^n(P) = f(f^{n-1}(P))$, and say that f is stable at level n if $f^n(P) = f^{n-1}(P)$. It is clear that if f is stable at level n, then f is stable at all levels m where m > n. Also, if the initial profile P is a fixed point of f then all levels are stable.
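The notions of stability at a level and of a repeating sequence of profiles can be sketched as a simple iteration. This is an illustrative helper, not part of the paper's formalism: `f` is any function on hashable profiles, `f^0(P)` is taken to be P itself (an assumption), and `max_rounds` is a hypothetical safety bound.

```python
def iterate_strategizing(f, profile, max_rounds=1000):
    """Apply f repeatedly, starting from `profile`.

    Returns ("stable", level, fixed_point) if f^level(P) == f^(level-1)(P),
    or ("cycle", length, repeated_profile) if an earlier profile recurs.
    Profiles must be hashable, e.g. tuples of rankings.
    """
    seen = {profile: 0}  # profile -> round at which it first appeared
    current = profile
    for level in range(1, max_rounds + 1):
        nxt = f(current)
        if nxt == current:
            # f^level(P) equals f^(level-1)(P): stable at this level.
            return ("stable", level, current)
        if nxt in seen:
            # An earlier profile recurs, so the iteration cycles from here on.
            return ("cycle", level - seen[nxt], nxt)
        seen[nxt] = level
        current = nxt
    return ("unknown", max_rounds, current)
```

Because the set of profiles is finite, one of the first two outcomes is always reached when `max_rounds` is large enough; the bound only guards against very long runs.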

Putting everything together, we can now define a voting model. 

Definition 2 (Voting Model). Given a set of agents A and candidates O, a voting model is a 5-tuple $\langle A, O, \{P^*_i\}_{i \in A}, Ag, f \rangle$, where $P^*_i$ is voter i's true preference; Ag is an aggregation function with domain and range as defined above; f is a strategizing function.

3 [2] shows that the Gibbard-Satterthwaite theorem holds when ties are permitted.

4 This does not quite work for approval voting where P does not fully determine the sincere vote v, but we will ignore this issue here, as it does not apply in the case of plurality elections, whether of one, or of many 'winners'.

5 Note that $P_i$ may itself be a member of $X_i$, in which case we shall assume that $f_i(P_{-i}) = P_i$.

Note that in our definition above, we use aggregation functions rather than 
tallying functions (which pick a winning candidate). This is because we can 
view tallying functions as selecting a ‘winner’ from the output of an aggregation 
function. So in our model, the result of an election is a ranking of the candidates. 
This allows our results to apply not only to conventional plurality voting, but 
also to those situations where more than one candidate is to be elected. They 
require some modification to apply to approval voting, as the ballot is not then 
determined by the preference ordering but also needs a cut-off point between 
approved and ‘dis-approved’ candidates. 

The following example demonstrates the type of analysis that can be modeled 
using a strategizing function. 

Example 1. Suppose that there are four candidates $O = \{o_1, o_2, o_3, o_4\}$ and five groups of voters: A, B, C, D and E. Suppose that the sizes of the groups are given as follows: |A| = 40, |B| = 30, |C| = 15, |D| = 8 and |E| = 7. We assume that all the agents in each group have the same true preference and that they all vote the same way. Suppose that the tallying function is plurality vote. We give the agents' true preferences and the summary of the four elections in the table below. The winner of each round is given in the last row.

P_A = (o1, o4, o2, o3)
P_B = (o2, o1, o3, o4)
P_C = (o3, o2, o4, o1)
P_D = (o4, o1, o2, o3)
P_E = (o3, o1, o2, o4)

Size  Group   I    II   III  IV
40    A       o1   o1   o4   o1
30    B       o2   o2   o2   o2
15    C       o3   o2   o2   o2
8     D       o4   o4   o1   o4
7     E       o3   o3   o1   o1
      Winner  o1   o2   o2   o1


The above table can be justified by assuming that all agents use the following 
protocol. If the current winner is o, then agent i will switch its vote to some 
candidate o' provided 1) i prefers o' to o, and 2) the current total for o' plus 
agent i’s votes for o' is greater than the current total for o. By this protocol 
an agent (thinking only one step ahead) will only switch its vote to a candidate 
which is currently not the winner. 

In round I, everyone reports their top choice and $o_1$ is the winner. C likes $o_2$ better than $o_1$, and its own total plus B's votes for $o_2$ exceeds the current votes for $o_1$. Hence by the protocol, C will change its vote to $o_2$. A will not change its vote in round II since its top choice is the winner. D and E also remain fixed since they do not have an alternative o' as required by the protocol. In round III, group A changes its vote to $o_4$ since it is preferred to the current winner ($o_2$) and its own votes plus D's current votes for $o_4$ exceed the current votes for $o_2$. B and C do not change their votes: B's top choice $o_2$ is the current winner, and C has no o' better than $o_2$ which satisfies condition 2).








Ironically, groups D and E change their votes to $o_1$, since it is preferred to the current winner $o_2$ and group A is currently voting for $o_1$. Finally, in round IV, group A notices that E is voting for $o_1$, which A prefers to $o_4$, and so changes its votes back to $o_1$. The situation stabilizes with $o_1$ which, as it happens, is also the Condorcet winner.
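The four rounds of Example 1 can be replayed with a short simulation. This is a sketch under two assumptions that the text leaves open: all groups update simultaneously from the previous round's totals, and a group that has several admissible alternatives o' switches to the one it ranks highest (never "switching" to the candidate it already votes for). The identifiers are hypothetical.

```python
SIZES = {"A": 40, "B": 30, "C": 15, "D": 8, "E": 7}
TRUE_PREFS = {  # best candidate first, as in the example
    "A": ("o1", "o4", "o2", "o3"),
    "B": ("o2", "o1", "o3", "o4"),
    "C": ("o3", "o2", "o4", "o1"),
    "D": ("o4", "o1", "o2", "o3"),
    "E": ("o3", "o1", "o2", "o4"),
}
CANDIDATES = ("o1", "o2", "o3", "o4")


def tally(votes):
    """Total number of voters behind each candidate."""
    totals = {c: 0 for c in CANDIDATES}
    for group, candidate in votes.items():
        totals[candidate] += SIZES[group]
    return totals


def winner(votes):
    """Plurality winner; ties (none occur here) go to the lower-numbered name."""
    totals = tally(votes)
    return min(CANDIDATES, key=lambda c: (-totals[c], c))


def next_round(votes):
    """One simultaneous application of the switching protocol."""
    totals, current = tally(votes), winner(votes)
    new_votes = {}
    for group, pref in TRUE_PREFS.items():
        new_votes[group] = votes[group]
        # Candidates the group ranks above the current winner, best first.
        for alt in pref[:pref.index(current)]:
            if alt != votes[group] and totals[alt] + SIZES[group] > totals[current]:
                new_votes[group] = alt
                break
    return new_votes


def simulate(rounds=4):
    """Round I is sincere voting; later rounds apply the protocol."""
    votes = {g: pref[0] for g, pref in TRUE_PREFS.items()}
    history = [votes]
    for _ in range(rounds - 1):
        votes = next_round(votes)
        history.append(votes)
    return history
```

Under these assumptions the simulated votes agree with the table, and applying `next_round` once more to the round IV votes changes nothing, which matches the remark that the situation stabilizes.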

Much more can be said about the above analysis, but this is a topic for a different paper. We now point out that for every aggregation function Ag and any strategizing function f, there must be instances in which f never stabilizes:

Theorem 1. For any given tallying function $Ag_v$, there exists an initial vector of preferences such that f never stabilizes.

This follows easily from the Gibbard-Satterthwaite theorem. Suppose not; then we show that there is a strategy-proof tallying function, contradicting the Gibbard-Satterthwaite theorem. Suppose that $Ag_v$ is an arbitrary tallying function and $P^*$ the vector of true preferences. Suppose there always is a level k at which f stabilizes given the agents' true preferences $P^*$. But then define Ag' to be the outcome of applying $Ag_v$ to $f^k(P^*)$, where $P^*$ are the agents' true preferences. Then, given some obvious conditions on the strategizing function f, Ag' will be a strategy-proof tallying function, contradicting the Gibbard-Satterthwaite theorem. Hence there must be situations in which f never stabilizes.

Since our candidate and agent sets are finite, if f does not stabilize then f cycles. We say that f has a cycle of length n if there are n different votes $P_1, \ldots, P_n$ such that $f(P_i) = P_{i+1}$ for all $1 \leq i \leq n - 1$ and $f(P_n) = P_1$.

3 Dependency on Knowledge 

Suppose that agent i knows the preferences of the other agents, and that no 
other agent knows agent i's preference (and agent i knows this). Then i is in
a very privileged position, where its preferences are completely secret, but it 
knows it can strategize using the preferences of the other agents. In this case, i 
will always know when to strategize and when the new outcome is ‘better’ than 
the current outcome. But if i only knows the preferences of a certain subset B 
of the set A of agents, then there still may be a set of possible outcomes that it 
could force. Since i only knows the preferences of the agents in the set B , any 
strategy P will generate a set of possible outcomes. Suppose that there are two 
strategies P and P' that agent i is choosing between. Then the agent is choosing 
between two different sets of possible outcomes. Some agents may only choose to 
strategize if they are guaranteed a better outcome. Other agents may strategize if 
there is even a small chance of getting a better outcome and no chance of getting 
a worse outcome. We will keep this process of choosing a strategy abstract, and 
only assume that every agent will in fact choose one of the strategies available 
to it. Let $S_i$ be agent i's strategy choice function, which accepts the votes of a group of agents and returns a preference P that may result in a better outcome for agent i given that the agents report their current preferences. We will assume that if $B = \emptyset$ then $S_i$ picks $P^*_i$. That is, agents will vote according to their true preferences unless there is more information.

As voting takes place or polls reveal potential voting patterns, the facts that 
each agent knows will change. We assume that certain agents may be in a more 
privileged position than other agents. As in [14], define a knowledge graph to be any graph with A as its set of vertices. If there is an edge from i to j, then we assume that agent i knows agent j's current vote, i.e., how agent j voted in the current election. Let $\mathcal{K} = (A, E_{\mathcal{K}})$ be a knowledge graph ($E_{\mathcal{K}}$ is the set of edges of $\mathcal{K}$). We assume that $i \in A$ knows the current votes of all agents accessible from i. Let $B_i = \{j \mid \text{there is an edge from } i \text{ to } j \text{ in } \mathcal{K}\}$. Then $S_i$ will select the strategy that agent i would prefer given how agents in $B_i$ voted currently.

We clarify the relationship between the stabilization of the strategizing function and the existence of a cycle in the knowledge graph $\mathcal{K} = (A, E_{\mathcal{K}})$ by the following:

Theorem 2. Fix a voting model $\langle A, O, \{P^*_i\}_{i \in A}, Ag, f \rangle$ and a knowledge graph $\mathcal{K} = (A, E_{\mathcal{K}})$. If $\mathcal{K}$ is directed and acyclic then the strategizing function f will stabilize at level k, where k is the height of the graph $\mathcal{K}$; f can cycle only if the associated knowledge graph has a cycle.

Proof. Since $\mathcal{K}$ is a directed acyclic graph, there is at least one agent i such that $B_i = \emptyset$. By assumption such an agent will vote according to $P^*_i$ at every stage. Let

$A_0 = \{i \mid i \in A \text{ and } B_i = \emptyset\}$

and

$A_k = \{i \mid \text{if there is } (i, j) \in E_{\mathcal{K}}, \text{ then } j \in A_l \text{ for some } l < k\}$

Given (by induction on k) that the agents in $A_{k-1}$ stabilized by level k - 1, an agent $i \in A_k$ need only wait k - 1 rounds, then choose the strategy according to $S_i$. □
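The layering used in the proof can be computed directly. The sketch below assigns to each agent the smallest k for which it belongs to the k-th layer, reading the layer condition as "every agent that i has an edge to lies in some earlier layer"; the function name and the edge representation as pairs (i, j) are assumptions made for illustration.

```python
def knowledge_layers(agents, edges):
    """Return {agent: k}, where k is the smallest layer index of the agent.

    `edges` is a set of pairs (i, j): agent i knows agent j's current vote.
    Raises ValueError if the graph contains a cycle.
    """
    successors = {i: set() for i in agents}
    for i, j in edges:
        successors[i].add(j)

    level = {}
    remaining = set(agents)
    k = 0
    while remaining:
        # Agents all of whose successors were placed in an earlier layer.
        layer = {i for i in remaining
                 if all(j in level for j in successors[i])}
        if not layer:
            raise ValueError("knowledge graph has a cycle")
        for i in layer:
            level[i] = k
        remaining -= layer
        k += 1
    return level
```

For a chain in which agent 1 knows agent 2's vote and agent 2 knows agent 3's vote, agent 3 lands in layer 0, agent 2 in layer 1 and agent 1 in layer 2; the largest layer index is the height of the graph referred to in Theorem 2.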

The following is an example of a situation in which the associated strategizing 
function never stabilizes: 

Example 2. Consider three candidates {a, b, c} and 100 agents connected by a complete knowledge graph. Suppose that 40 agents prefer a > b > c (group I), 30 prefer b > c > a (group II) and 30 prefer c > a > b (group III). If we assume that the voting rule is simple majority, then after reporting their initial preferences, candidate a will be the winner with 40 votes. The members of group II dislike a the most, and will strategize in the next election by reporting c > b > a as their preference. So, in the second round, c will win. But now, members of group I will report b > a > c as their preference, in an attempt to draw support away from their lowest ranked candidate. Candidate c will still win the third election, but by changing their preferences (and making them public) group I sends a signal to group II that it should report its true preference; this will enable group I to have its second preferred candidate b come out winner. This cycling will continue indefinitely; b will win for two rounds, then a for two rounds, then c for two, etc.







4 An Epistemic Logic for Voting Models 

In this section we define an epistemic logic for reasoning about voting models. 
In Example 2, it is clear that voters are reasoning about the states of knowledge 
of other voters and furthermore, an agent reasons about the change in states 
of knowledge of other voters on receipt of information on votes cast by them. 
We now sketch the details of a logic $\mathcal{KV}$ for reasoning about knowledge and the change of knowledge in a fixed voting model V.

4.1 The Logic KV - Syntax

In this section we will assume that each vote is an expressed preference, which may or may not be the true preference of an agent. So the expression 'preference' without the qualifier 'true' will simply mean an agent's current vote. We assume that for each preference P there is a symbol $\mathbf{P}$ that represents it. There are then two types of primitive propositions in $\mathcal{L}(\mathcal{KV})$. First, there are statements with the content "agent i's preference is P". Let $\mathbf{P}_i$ represent such statements. Secondly, we include statements with the content "P is the current outcome of the aggregation function". Let $\mathbf{P}_O$ represent such statements.

Our language includes the standard boolean connectives, an epistemic modality $K_i$ indexed by each agent i plus an additional modality $\Diamond_i$ (similarly indexed). Formulas in $\mathcal{L}(\mathcal{KV})$ take the following syntactic form:

$\phi := p \mid \neg\phi \mid \phi \wedge \psi \mid K_i\phi \mid \Diamond_i\phi$

where p is a primitive proposition and $i \in A$. We use the standard definitions for $\vee$, $\rightarrow$ and the duals $L_i$, $\Box_i$. $K_i\phi$ is read as "agent i knows $\phi$"; $\Diamond_i\phi$ is read as "after agent i strategizes, $\phi$ becomes true".

4.2 The Logic KV - Semantics 

Before specifying a semantics we make some brief remarks on comparing preferences. Strategizing means reporting a preference different from your true preference. An agent will strategize if by reporting a preference other than its true preference, the outcome is 'closer' to its true preference than the outcome it would have obtained had it reported its true preference originally. Given preferences P, Q, R, we use the notation $P \sqsubseteq_R Q$ to indicate that P is at least as compatible with R as Q is. Given the above ternary relation, we can be more precise about when an agent will strategize. Given two preferences P and Q, we will say that an agent whose true preference is R prefers P to Q if $P \sqsubseteq_R Q$ holds. That is, i prefers P to Q if P is at least as 'close' to i's true preference as Q is.

We assume the following conditions on $\sqsubseteq$. For arbitrary preferences P, Q, R, S:

1. (Minimality) $R \sqsubseteq_R P$

2. (Reflexivity) $P \sqsubseteq_R P$

3. (Transitivity) If $P \sqsubseteq_R Q$ and $Q \sqsubseteq_R S$, then $P \sqsubseteq_R S$.

4. (Pareto Invariance) Suppose that $R = (o_1, \ldots, o_m)$ and $P = (o'_1, \ldots, o'_m)$ and Q is obtained from P by swapping $o'_i$ and $o'_j$ for some $i \neq j$. If $R(o'_i, o'_j)$ and $P(o'_i, o'_j)$, i.e., R agrees with P on $o'_i, o'_j$ and disagrees with Q, then P must be at least as close to R as Q ($P \sqsubseteq_R Q$).

(Minimality) ensures that a true preference is always the most desired outcome. (Reflexivity) and (Transitivity) carry their usual meanings. As for Pareto invariance, note that swapping $o'_i, o'_j$ may also change other relationships. Our example below will show that this is not a problem.

The following is an example of an ordering $\sqsubseteq_R$ satisfying the above conditions. Let $R = (o_1, \ldots, o_m)$. For each vector P, suppose that $c_P(o_i)$ is the count of $o_i$ in vector P, i.e., the numeric position of $o_i$ numbering from the right. For any vector P, let $V_R(P) = c_R(o_1)c_P(o_1) + \cdots + c_R(o_m)c_P(o_m)$. This assigns the following value to R: $V_R(R) = m^2 + (m - 1)^2 + \cdots + 1^2$. We say that P is closer to R than Q iff $V_R(P)$ is greater than $V_R(Q)$. This creates a strict ordering over preferences, which can be weakened to a partial order by composing $V_R$ with a weakly increasing function.
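The example ordering is easy to compute. The sketch below reads the score as the sum, over all candidates, of the candidate's position in R times its position in P (positions counted from the right), which is the reading under which R itself scores m^2 + (m-1)^2 + ... + 1^2. The function names are hypothetical and rankings are tuples listed best first.

```python
from itertools import permutations


def count(P, o):
    """Position of o in P, numbering from the right (last place = 1)."""
    return len(P) - P.index(o)


def V(R, P):
    """Score of P relative to R: sum over candidates of count(R, o) * count(P, o)."""
    return sum(count(R, o) * count(P, o) for o in R)


def at_least_as_close(P, Q, R):
    """P is at least as compatible with R as Q is, under this score."""
    return V(R, P) >= V(R, Q)
```

With R = (o1, o2, o3), `V(R, R)` evaluates to 9 + 4 + 1 = 14, and the fully reversed ranking scores 3 + 4 + 3 = 10, so R is at least as close to itself as any other ranking, in line with (Minimality).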

Let $V = \langle A, O, \{P^*_i\}_{i \in A}, Ag, f \rangle$ be a fixed voting model. We define a Kripke structure for our bi-modal language based on V. States in this structure are vectors of preferences (see footnote 7) together with the outcome of the aggregation function. The set of states W is defined as follows:

$W = \{(P, O) \mid P \in \mathit{Pref}^n, Ag(P) = O\}$

Intuitively, given a state (P, O), P represents the preferences that are reported by the agents and O is the outcome of the aggregation function applied to P. So states of the world will be complete descriptions of stages of elections.

Our semantics helps clarify our decision to use two modalities. Let $(P, O)$
be an element of $W$. To understand the strategizing modality, note that when
an agent strategizes it only changes the $i$th component of $P$, i.e., the accessible
worlds for this modality are those in which the remaining components of $P$ are
fixed. For the knowledge modality note that all agents know how they voted,
which implies that accessible worlds for this modality are those in which the $i$th
component of $P$ remains fixed while others vary.

We now define accessibility relations for each modality. Since the second
component of a state can be calculated using $Ag$ we write $P$ for $(P, O)$. For the
knowledge modality, we assume that the agents know how they voted and so
define for each $i \in A$ and preferences $P, Q$:

$$(P, O)\, R_i\, (Q, O') \text{ iff } P_i = Q_i$$

The above relation does not take into account the fact that some agents may
be in a more privileged position than other agents, formally represented by the
knowledge graph from the previous section. If we have fixed a knowledge graph,
then agent $i$ not only knows how it itself voted, but also the (current) preferences
of each of the agents reachable from it in the knowledge graph. Let
$\mathcal{K} = (A, E_{\mathcal{K}})$ be a knowledge graph, and recall that $B_i$ is the set of nodes
reachable from $i$. Given two vectors of preferences $P$ and $Q$ and a group of agents
$G \subseteq A$, we say $P_G = Q_G$ iff $P_i = Q_i$ for each $i \in G$. We can now define an
epistemic relation based on $\mathcal{K}$:

$$(P, O)\, R^{\mathcal{K}}_i\, (Q, O') \text{ iff } P_i = Q_i \text{ and } P_{B_i} = Q_{B_i}$$

5 Plurality cannot be produced this way, but other functions satisfying 1-4 can easily
be found that do.

7 Defining Kripke structures over agents' preferences has been studied by other
authors. [18] has a similar semantics in a different context.

Clearly for each agent $i$ and knowledge graph $\mathcal{K}$, $R^{\mathcal{K}}_i$ is an equivalence
relation; hence each $K_i$ is an S5 modal operator. The exact logic for the
strategizing modalities depends on the properties of the ternary relation $\sqsubseteq$.

For the strategizing modalities, we define a relation $A_i \subseteq W \times W$ as follows.
Given preferences $P, Q$:

$$(P, O)\, A_i\, (Q, O') \text{ iff } P_{-i} = Q_{-i} \text{ and } O' \sqsubseteq_{P^*_i} O$$

where $P_{-i}$ is all components of $P$ except for the $i$th component. So, $(P, O)$ and
$(Q, O')$ are $A_i$-related iff they have the same $j$th component for all $j \neq i$ and
agent $i$ prefers outcome $O'$ to outcome $O$ relative to $i$'s true preference $P^*_i$.
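The state space and the two accessibility relations can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' construction: the aggregation function `ag`, the preference test `prefers`, the `reachable` map (standing in for the sets $B_i$) and all toy data are hypothetical, and whether `prefers` is read strictly or weakly is left to the caller.

```python
from itertools import permutations, product

def build_states(candidates, n_agents, ag):
    """W = {(P, O) | P in Pref^n, Ag(P) = O}; Pref = all strict rankings."""
    prefs = list(permutations(candidates))
    return [(P, ag(P)) for P in product(prefs, repeat=n_agents)]

def epistemic_related(i, reachable, s, t):
    """Agent i's own vote and the votes of agents reachable from i coincide."""
    (P, _), (Q, _) = s, t
    return all(P[j] == Q[j] for j in {i} | set(reachable[i]))

def strategy_related(i, prefers, s, t):
    """All components except the i-th coincide and agent i prefers the new outcome."""
    (P, O), (Q, O2) = s, t
    same_others = all(P[j] == Q[j] for j in range(len(P)) if j != i)
    return same_others and prefers(i, O2, O)

# Assumed toy model: two agents, two candidates, outcome = agent 0's top choice.
true_pref = [("a", "b"), ("b", "a")]
ag = lambda P: P[0][0]
prefers = lambda i, new, old: true_pref[i].index(new) < true_pref[i].index(old)
reachable = {0: [1], 1: []}          # agent 0 can see agent 1's reported preference

W = build_states(("a", "b"), 2, ag)
s = ((("b", "a"), ("a", "b")), "b")  # agent 0 misreports, outcome is b
t = ((("a", "b"), ("a", "b")), "a")  # agent 0 reports truthfully, outcome is a
```

In this toy model agent 0 can move from `s` to `t`, whereas agent 1 has no strategizing move at `s` because the assumed aggregation function ignores its vote.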

An election is a sequence of states. We say that an election
$E = (s_1, s_2, \ldots, s_n)$ respects the strategizing function $f$ if $f(s_i) = s_{i+1}$ for
$i = 1, \ldots, n-1$. We assume always that $f$ is such that $f(s) = s$ unless the agent
knows that it can strategize and get a better outcome, and how it should so
strategize. A model for $\mathcal{V}$ is a tuple $M = (W, R_i, A_i, V)$ where $V : W \to 2^{\Phi_0}$
(where $\Phi_0$ is the set of primitive propositions). We assume that all relations $R_i$
are based on a given knowledge graph $\mathcal{K}$. Let $(P, O) \in W$ be any state; we define
truth in a model as follows:

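1. $(P, O) \models p$ iff $p \in V(P, O)$ and $p \in \Phi_0$

2. $(P, O) \models \neg\phi$ iff $(P, O) \not\models \phi$

3. $(P, O) \models \phi \wedge \psi$ iff $(P, O) \models \phi$ and $(P, O) \models \psi$

4. $(P, O) \models K_i\phi$ iff for all $(Q, O')$ such that $(P, O)\, R^{\mathcal{K}}_i\, (Q, O')$, $(Q, O') \models \phi$

5. $(P, O) \models \Diamond_i\phi$ iff there is $(Q, O')$ such that $(P, O)\, A_i\, (Q, O')$ and $(Q, O') \models \phi$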
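The five clauses translate directly into a recursive evaluator. The sketch below assumes a finite model given explicitly as sets of state pairs. The tuple encoding of formulas, the dictionary layout of `model` and the extra `"top"` case (for the constant $\top$) are our own conventions for illustration.

```python
def holds(model, state, formula):
    """Evaluate a formula at a state, following clauses 1-5 of the truth definition."""
    kind = formula[0]
    if kind == "top":                      # convenience case for the constant "true"
        return True
    if kind == "atom":                     # clause 1
        return formula[1] in model["val"][state]
    if kind == "not":                      # clause 2
        return not holds(model, state, formula[1])
    if kind == "and":                      # clause 3
        return holds(model, state, formula[1]) and holds(model, state, formula[2])
    if kind == "K":                        # clause 4: all epistemic successors
        _, i, sub = formula
        return all(holds(model, t, sub) for (s, t) in model["R"][i] if s == state)
    if kind == "dia":                      # clause 5: some strategizing successor
        _, i, sub = formula
        return any(holds(model, t, sub) for (s, t) in model["A"][i] if s == state)
    raise ValueError(f"unknown connective: {kind}")

# Assumed two-state toy model with a single agent 0.
model = {
    "R": {0: {("s", "s"), ("s", "t"), ("t", "s"), ("t", "t")}},
    "A": {0: {("s", "t")}},
    "val": {"s": {"p"}, "t": {"q"}},
}
can_improve = ("dia", 0, ("top",))
```

Here `holds(model, "s", can_improve)` is true because `t` is a strategizing successor of `s`, while agent 0 does not know this, since it cannot distinguish `s` from `t`, where no such successor exists.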

Nothing in our definition of a model forces primitive propositions to have their 
intended meaning. We therefore make use of the following definition. 

Definition 3. A valuation function $V$ is an appropriate valuation for a
model $M$ iff $V$ satisfies the following conditions. Let
$\mathcal{V} = (A, O, \{P^*_i\}_{i \in A}, Ag, f)$ be a voting model and $M$ a model based on $\mathcal{V}$.
Let $(P, O) \in W$ be any state. Then:

1. For each $i \in A$, $P_i \in V(P, O)$ iff the proposition $P_i$ represents the preference $P_i$.

2. For each $P_O$, $P_O \in V(P, O)$ iff the proposition $P_O$ represents $O$.







We assume that valuation functions are appropriate for the corresponding model. 

The following formula implies strategizing for an individual agent. It says that
agent $i$ knows that the outcome is $P_O$ and by reporting a different preference a
preferred outcome can be achieved.

$$K_i(P_O \wedge \Diamond_i \top)$$



We are now in a position to present our last main result. 

Theorem 3. Given a voting system $\mathcal{V} = (A, O, \{P^*_i\}_{i \in A}, Ag, f)$, a knowledge
graph $\mathcal{K}$, and a model $M$ for $\mathcal{V}$, let $E$ be an election that respects the
strategizing function $f$. If there is a state $P$ such that $E_l = P$ for some $l$ and
$P \models \neg K_i(P_O \wedge \Diamond_i \top)$ for all $i$, then $P$ is a fixed point of $f$. Equivalently,
given an election $E$ that respects $f$ and some $k$ such that $E_{k+1} \neq E_k$, i.e., $E_k$ is
not a fixed point of $f$, then $\exists i \in A$ such that:

$$E_k \models K_i(P_O \wedge \Diamond_i \top)$$

That is, if an agent strategizes at some stage in the election then the agent knows
that this strategizing will result in a preferred outcome.

5 Conclusion 

We have explored some properties of strategic voting and noted that the 
Gibbard-Satterthwaite theorem only applies in those situations where agents can 
obtain the appropriate knowledge. Note that our example in the Introduction 
showed how strategizing can lead to a rational outcome in elections. In our ex- 
ample the Condorcet winner - the winner in pairwise head-to-head contests - was 
picked via strategizing. Since our framework makes it possible to view opinion 
polls as the $n - 1$ stages of an $n$-stage election, it implies that communication of
voters’ preferences and the results of opinion polls can play an important role in 
ensuring rational outcomes to elections. A similar line of reasoning in a different 
context can be found in [15]. Put another way, while the Gibbard-Satterthwaite 
theorem implies that we are stuck with voting mechanisms susceptible to strate- 
gizing, our work indicates ways for voters to avoid irrational outcomes using such 
mechanisms. Connections such as those explored in this paper are also useful in 
deontic contexts [10,16] i.e., an agent can only be obligated to take some action 
if the agent is in possession of the requisite knowledge. 

For future work, we note that in this study, we left the definition of the 
agents’ strategy choice function informal, thus assuming that agents have some 
way of deciding which preference to report if given a choice. This can be made 
more formal. We could then study the different strategies available to the agents. 
For example, some agents may only choose to strategize if they are guaranteed 
to get a better outcome, whereas other agents might strategize even if there is 
only a small chance of getting a better outcome. 







Another question suggested by this framework is: what are the effects of 
different levels of knowledge of the current preferences on individual strategy 
choices? Suppose that among agent i and agent j, both i and j’s true preferences 
are common knowledge. Now when agent i is trying to decide whether or not to 
strategize, i knows that j will be able to simulate i's reasoning. Thus if i chooses
a strategy based on j’s true preference, i knows that j will choose a strategy 
based on i’s choice of strategy, and so i must choose a strategy based on j’s 
response to i’s original strategy. We conjecture that if there is only pairwise 
common knowledge among the agents of the agents’ true preferences, then the 
announcement of the agents’ true preferences is a stable announcement. 

On a technical note, the logic of knowledge we developed uses S5 modalities. 
We would like to develop a logic that uses KD45 modalities - i.e., a logic of belief. 
This is because beliefs raise the interesting issue that a voter - or groups of voters 
- can have possibly inconsistent beliefs about other voters’ preferences, while this 
variation is not possible in the knowledge case. Another area of exploration will 
be connections with other distinct approaches to characterize game theoretic 
concepts in modal logic such as [8,3,18]. Lastly, a deeper formal understanding 
of the relationship between the knowledge and strategizing modalities introduced 
in this paper will become possible after the provision of an appropriate axiom 
system for $\mathcal{KV}$. Our work is a first step towards clarifying the knowledge-theoretic
properties of voting, but some insight into the importance of states of knowledge 
and the role of opinion polls is already at hand. 

References 

1. K. J. Arrow. Social choice and individual values (2nd ed.). Wiley, New York, 1963. 

2. Jean-Pierre Benoit. Strategic manipulation in games when lotteries and ties are 
permitted. Journal of Economic Theory , 102:421-436, 2002. 

3. Giacomo Bonanno. Modal logic and game theory: two alternative approaches. Risk 
Decision and Policy, 7:309-324, 2002. 

4. Steven J. Brams. Voting Procedures. In Handbook of Game Theory, volume 2, 
pages 1055-1089. Elsevier, 1994. 

5. Steven J. Brams and Peter C. Fishburn. Voting Procedures. In Handbook of Social
Choice and Welfare. North-Holland, 1994.

6. Patricia Everaere, Sebastien Konieczny, and Pierre Marquis. On merging
strategy-proofness. In Proceedings of KR 2004. Morgan Kaufmann, 2004.

7. Allan Gibbard. Manipulation of Voting Schemes: A General Result. Econometrica,
41(4):587-601, 1973.

8. Paul Harrenstein, Wiebe van der Hoek, John-Jules Meyer, and Cees Witteveen. A
modal characterization of Nash equilibria. Fundamenta Informaticae,
57(2-4):281-321, 2002.

9. Sebastien Konieczny and Ramon Pino-Perez. On the logic of merging. In A. G. 
Cohn, L. Schubert, and S. C. Shapiro, editors, Principles of Knowledge Represen- 
tation and Reasoning: Proceedings of the Sixth International Conference (KR ’98), 
pages 488-498, San Francisco, California, 1998. Morgan Kaufmann. 

10. Alessio Lomuscio and Marek Sergot. Deontic interpreted systems. Studia Logica, 
75, 2003. 




11. Pedrito Maynard-Zhang and Daniel Lehmann. Representing and aggregating
conflicting beliefs. Journal of Artificial Intelligence, 19:155-203, 2003.

12. Pedrito Maynard-Zhang and Yoav Shoham. Belief fusion: Aggregating pedigreed
belief states. Journal of Logic, Language and Information, 10(2):183-209, 2001.

13. Thomas Meyer, Aditya Ghose, and Samir Chopra. Social choice theory, merging
and elections. In Proceedings of Sixth European Conference on Symbolic and
Quantitative Approaches to Reasoning with Uncertainty, ECSQARU-2001.
Springer-Verlag, 2001.

14. Eric Pacuit and Rohit Parikh. A logic of communication graphs. In Proceedings
of DALT-04. Springer-Verlag, 2004.

15. Rohit Parikh. Social software. Synthese, 132, 2002.

16. Rohit Parikh, Eric Pacuit, and Eva Cogan. The logic of knowledge based
obligations. In Proceedings of DALT-2004. Springer-Verlag, 2004.

17. Mark Satterthwaite. The Existence of a Strategy Proof Voting Procedure: a Topic 
in Social Choice Theory. PhD thesis, University of Wisconsin, 1973. 

18. J. van Benthem. Rational dynamics and epistemic logic in games. Technical report, 
ILLC, 2003. 




The CIFF Proof Procedure for Abductive Logic 
Programming with Constraints 



U. Endriss$^1$, P. Mancarella$^2$, F. Sadri$^1$, G. Terreni$^2$, and F. Toni$^{1,2}$

$^1$ Department of Computing, Imperial College London
{ue,fs,ft}@doc.ic.ac.uk
$^2$ Dipartimento di Informatica, Universita di Pisa
{paolo,terreni,toni}@di.unipi.it



Abstract. We introduce a new proof procedure for abductive logic pro- 
gramming and present two soundness results. Our procedure extends that 
of Fung and Kowalski by integrating abductive reasoning with constraint 
solving and by relaxing the restrictions on allowed inputs for which the 
procedure can operate correctly. An implementation of our proof pro- 
cedure is available and has been applied successfully in the context of 
multiagent systems. 



1 Introduction 

Abduction has found broad application as a tool for hypothetical reasoning with 
incomplete knowledge, which can be handled by labelling some pieces of informa- 
tion as abducibles , i.e. as possible hypotheses that can be assumed to hold, pro- 
vided that they are consistent with the given knowledge base. Abductive Logic 
Programming (ALP) combines abduction with logic programming enriched by 
integrity constraints to further restrict the range of possible hypotheses. Im- 
portant applications of ALP include planning [10], requirements specification 
analysis [8], and agent communication [9]. In recent years, a variety of proof 
procedures for ALP have been proposed, including the IFF procedure of Fung 
and Kowalski [4]. Here, we extend this procedure in two ways, namely (1) by 
integrating abductive reasoning with constraint solving (in the sense of CLP, not 
to be confused with integrity constraints), and (2) by relaxing the allowedness 
conditions given in [4] to be able to handle a wider class of problems. 

Our interest in extending IFF in this manner stems from applications devel- 
oped in the SOCS project, which investigates the use of computational logic- 
based techniques in the context of multiagent systems for global computing. In 
particular, we use ALP extended with constraint solving to give computational 
models for an agent’s planning , reactivity and temporal reasoning capabilities [5]. 
We found that our requirements for these applications go beyond available state- 
of-the-art ALP proof procedures. While ACLP [6], for instance, permits the use 
of constraint predicates (unlike IFF), its syntax for integrity constraints is too 
restrictive to express the planning knowledge bases (using a variant of the abduc- 
tive event calculus [10]) used in SOCS. In addition, many procedures put strong, 
sometimes unnecessary, restrictions on the use of variables. The procedure
proposed in this paper, which we call CIFF, manages to overcome these restrictions
to a degree that has allowed us to apply it successfully to a wide range of
problems. We have implemented CIFF in Prolog;$^1$ the system forms an integral part
of the PROSOCS platform for programming agents in computational logic [11].

J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 31-43, 2004.
© Springer-Verlag Berlin Heidelberg 2004

In the next section we are going to set out the ALP framework used in this 
paper and discuss the notion of allowedness. Section 3 then specifies the CIFF 
proof procedure which we propose as a suitable reasoning engine for this frame- 
work. Two soundness results for CIFF are presented in Section 4 and Section 5 
concludes. An extended version of this paper that, in particular, contains detailed 
proofs of our results is available as a technical report [3] . 

2 Abductive Logic Programming with Constraints 

We use classical first-order logic, enriched with a number of special predicate 
symbols with a fixed semantics, namely the equality symbol =, which is used to 
represent the unifiability of terms (i.e. as in standard logic programming), and 
a number of constraint predicates. We assume the availability of a sound and 
complete constraint solver for this constraint language. In principle, the exact 
specification of the constraint language is independent from the definition of the 
CIFF procedure, because we are going to use the constraint solver as a black 
box component.$^2$ However, the constraint language has to include a relation
symbol for equality (we are going to write $t_1 =_c t_2$) and it must be closed under
complements. In general, the complement of a constraint $Con$ will be denoted as
$\overline{Con}$ (but we are going to write $t_1 \neq_c t_2$ for the complement of $t_1 =_c t_2$). The range
of admissible arguments to constraint predicates again depends on the specifics 
of the chosen constraint solver. A typical choice for a constraint system would 
be an arithmetic constraint solver over integers providing predicates such as < 
and > and allowing for terms constructed from variables, integers and function 
symbols representing operations such as addition and multiplication. 

Abductive logic programs. An abductive logic program is a pair (Th,IC) 
consisting of a theory Th and a finite set of integrity constraints IC. We present 
theories as sets of so-called iff-definitions: 

$$p(X_1, \ldots, X_k) \leftrightarrow D_1 \vee \cdots \vee D_n$$

The predicate symbol $p$ must not be a special predicate (constraints, $=$, $\top$ and $\bot$)
and there can be at most one iff-definition for every predicate symbol. Each of
the disjuncts $D_i$ is a conjunction of literals. Negative literals are written as
implications (e.g. $q(X, Y) \rightarrow \bot$). The variables $X_1, \ldots, X_k$ are implicitly universally
quantified with the scope being the entire definition. Any other variable is im- 
plicitly existentially quantified, with the scope being the disjunct in which it 
occurs. A theory may be regarded as the (selective) completion of a normal logic 
program (i.e. of a logic program allowing for negative subgoals in a rule) [2]. Any 
predicate that is neither defined nor special is called an abducible. 

1 The CIFF system is available at http://www.doc.ic.ac.uk/~ue/ciff/. 

2 Our implementation uses the built-in finite domain solver of Sicstus Prolog [1], but 
the modularity of the system would also support the integration of a different solver. 







In this paper, the integrity constraints in the set IC (not to be confused 
with constraint predicates) are implications of the following form: 

$$L_1 \wedge \cdots \wedge L_m \rightarrow A_1 \vee \cdots \vee A_n$$

Each of the $L_i$ must be a literal (with negative literals again being written in
implication form); each of the $A_i$ must be an atom. Any variables are implicitly
universally quantified with the scope being the entire implication. 

A query Q is a conjunction of literals. Any variables in Q are implicitly 
existentially quantified. They are also called the free variables. In the context of 
the CIFF procedure, we are going to refer to a triple $(Th, IC, Q)$ as an input.



Semantics. A theory provides definitions for certain predicates, while integrity 
constraints restrict the range of possible interpretations. A query may be re- 
garded as an observation against the background of the world knowledge encoded 
in a given abductive logic program. An answer to such a query would then pro- 
vide an explanation for this observation: it would specify which instances of the 
abducible predicates have to be assumed to hold for the observation to hold 
as well. In addition, such an explanation should also validate the integrity con- 
straints. This is formalised in the following definition: 

Definition 1 (Correct answer). A correct answer to a query $Q$ with respect
to an abductive logic program $(Th, IC)$ is a pair $(\Delta, \sigma)$, where $\Delta$ is a finite set
of ground abducible atoms and $\sigma$ is a substitution for the free variables occurring
in $Q$, such that $Th \cup Comp(\Delta) \models IC \wedge Q\sigma$.

Here $\models$ is the usual consequence relation of first-order logic with the restriction
that constraint predicates have to be interpreted according to the semantics
of the chosen constraint system and equalities evaluate to true whenever their
two arguments are unifiable. $Comp(\Delta)$ stands for the completion of the set of
abducibles in $\Delta$, i.e. any ground atom not occurring in $\Delta$ is assumed to be false.
If we have $Th \cup IC \models \neg Q$ (i.e. if $Q$ is false for all instantiations of the free
variables), then we say that there exists no correct answer to the query $Q$ given
the abductive logic program $(Th, IC)$.

Example 1. Consider the following abductive logic program:

$Th$: $p(T) \leftrightarrow q(X, T') \wedge T' < T \wedge T < 8$
      $q(X, T) \leftrightarrow X = a \wedge s(T)$

$IC$: $r(T) \rightarrow p(T)$

The set of abducible predicates is $\{r, s\}$. The query $r(6)$, for instance, should
succeed; a possible correct answer would be the set $\{r(6), s(5)\}$, with an empty
substitution. Intuitively, given the query $r(6)$, the integrity constraint in $IC$
would fire and force the atom $p(6)$ to hold, which in turn requires $s(T')$ for some
$T' < 6$ to be true (as can be seen by unfolding first $p(6)$ and then $q(X, T')$). □
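As a sanity check on Example 1, the following sketch tests candidate answer sets against the integrity constraint and the query $r(6)$. It hard-codes the two iff-definitions of the example rather than implementing a general procedure. The encoding of ground abducible atoms as `(name, time)` pairs, the integer interpretation of `<` and the function names are assumptions made for illustration.

```python
def p_holds(t, delta):
    """p(T) <-> q(X,T') and T' < T and T < 8, where q(X,T') <-> X = a and s(T').

    Under Comp(delta), s(T') is true exactly for the s-atoms in delta, and
    X = a is the only witness for X, so p(t) reduces to the test below.
    """
    return t < 8 and any(name == "s" and t1 < t for (name, t1) in delta)

def is_correct_answer(delta, query=("r", 6)):
    """Check Th u Comp(delta) |= IC and Q for the ground query r(6)."""
    # IC: r(T) -> p(T); r(T) is true exactly for the r-atoms in delta.
    ic_ok = all(p_holds(t, delta) for (name, t) in delta if name == "r")
    return query in delta and ic_ok

answer = {("r", 6), ("s", 5)}
```

The set `answer` passes the check, whereas dropping `("s", 5)` or replacing it by `("s", 7)` violates the integrity constraint, since no $s(T')$ with $T' < 6$ is then available.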







Allowedness. Fung and Kowalski [4] require inputs (Th, IC , Q) to meet a number 
of so-called allowedness conditions to be able to guarantee the correct operation 
of their proof procedure. These conditions are designed to avoid constellations 
with particular (problematic) patterns of quantification. Unfortunately, it is 
difficult to formulate appropriate allowedness conditions that guarantee a cor- 
rect execution of the proof procedure without imposing too many unnecessary 
restrictions. This is a well-known problem, which is further aggravated for 
languages that include constraint predicates. Our proposal is to tackle the issue 
of allowedness dynamically, i.e. at runtime, rather than adopting a static and 
overly strict set of conditions. In this paper, we are only going to impose the 
following minimal allowedness conditions: 3 

- An integrity constraint $A \rightarrow B$ is allowed iff every variable in it also occurs
in a positive literal within its antecedent $A$.

- An iff-definition $p(X_1, \ldots, X_k) \leftrightarrow D_1 \vee \cdots \vee D_n$ is allowed iff every variable
other than $X_1, \ldots, X_k$ occurring in a disjunct $D_i$ also occurs inside a positive
literal within the same $D_i$.

The crucial allowedness condition is that for integrity constraints: it ensures that, 
also after an application of the negation rewriting rule (which moves negative 
literals in the antecedent of an implication to its consequent), every variable 
occurring in the consequent of an implication is also present in its antecedent. 
The allowedness condition for iff-definitions merely allows us to maintain this 
property of implications when the unfolding rule (which, essentially, replaces a 
defined predicate with its definition) is applied to atoms in the antecedent of an 
implication. We do not need to impose any allowedness conditions on queries. 
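The two minimal allowedness conditions are simple syntactic checks, which the following sketch spells out. The representation is assumed for illustration only: a literal is a triple `(positive, predicate, args)`, variables are strings starting with an upper-case letter, and every non-negated atom (including constraint atoms such as `<`) is treated as a positive literal.

```python
def is_var(term):
    """Assumed convention: variables are capitalised strings."""
    return isinstance(term, str) and term[:1].isupper()

def variables(literal):
    _positive, _pred, args = literal
    return {a for a in args if is_var(a)}

def positive_vars(literals):
    return set().union(*(variables(l) for l in literals if l[0]))

def ic_allowed(antecedent, consequent):
    """Every variable of A -> B occurs in a positive literal of A."""
    all_vars = set().union(*(variables(l) for l in antecedent + consequent))
    return all_vars <= positive_vars(antecedent)

def definition_allowed(head_vars, disjuncts):
    """Every non-head variable of a disjunct occurs in a positive literal of it."""
    for disjunct in disjuncts:
        local = set().union(*(variables(l) for l in disjunct)) - set(head_vars)
        if not local <= positive_vars(disjunct):
            return False
    return True

# r(T) -> p(T) and the definition of p from Example 1 (T1 stands for T').
ic = ([(True, "r", ("T",))], [(True, "p", ("T",))])
p_def = (("T",), [[(True, "q", ("X", "T1")), (True, "<", ("T1", "T")), (True, "<", ("T", 8))]])
```

Both `ic` and `p_def` pass these checks, while an integrity constraint whose only antecedent literal is negative, such as $(r(T) \rightarrow \bot) \rightarrow p(T)$, does not.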

3 The CIFF Proof Procedure 

We are now going to formally introduce the CIFF proof procedure. The input 
(Th, IC, Q) to the procedure consists of a theory Th, a set of integrity constraints 
IC, and a query Q. There are three possible outputs: (1) the procedure succeeds 
and indicates an answer to the query Q ; (2) the procedure fails, thereby indi- 
cating that there is no answer; and (3) the procedure reports that computing an 
answer is not possible, because a critical part of the input is not allowed. 

The CIFF procedure manipulates, essentially, a set of formulas that are ei- 
ther atoms or implications. The theory Th is kept in the background and is only 
used to unfold defined predicates as they are being encountered. In addition to 
atoms and implications the aforementioned set of formulas may contain disjunc- 
tions of atoms and implications to which the splitting rule may be applied, i.e. 
which give rise to different branches in the proof search tree. The sets of formulas 
manipulated by the procedure are called nodes. A node is a set (representing a 
conjunction)$^4$ of formulas (atoms, implications, or disjunctions thereof) which
are called goals. A proof is initialised with the node containing the integrity
constraints $IC$ and the literals of the query $Q$. The proof procedure then repeatedly
manipulates the current node of goals by rewriting goals in the node, adding
new goals to it, or deleting superfluous goals from it. Most of this section is
concerned with specifying these proof rules in detail.

3 Note that the CIFF procedure could easily be adapted to work also on inputs not
conforming even to these minimal conditions, but then it would not be possible
anymore to represent quantification implicitly.

4 If a proof rule introduces a conjunction into a node, this conjunction is understood
to be broken up into its subformulas right away.

The structure of our proof rules guarantees that the following quantification
invariants hold for every node in a derivation:

- No implication contains a universally quantified variable that is not also
contained in one of the positive literals in its antecedent.

- No atom contains a universally quantified variable.

- No atom inside a disjunction contains a universally quantified variable.

In particular, these invariants subsume the minimal allowedness conditions dis- 
cussed in the previous section. The invariants also allow us to keep quantification 
implicit throughout a CIFF derivation by determining the quantification status 
of any given variable. Most importantly, any variable occurring in either the 
original query or an atomic conjunct in a node must be existentially quantified. 

Notation. In the sequel, we are frequently going to write $\vec{t}$ for a "vector" of
terms such as $t_1, \ldots, t_k$. For instance, we are going to write $p(\vec{t})$ rather than
$p(t_1, \ldots, t_k)$. To simplify presentation, we assume that there are no two
predicates that have the same name but different arities. We are also going to write
$\vec{t} = \vec{s}$ as a shorthand for $t_1 = s_1 \wedge \cdots \wedge t_k = s_k$ (with the implicit
assumption that the two vectors have the same length), and $[\vec{X}/\vec{t}]$ for the substitution
$[X_1/t_1, \ldots, X_k/t_k]$. Note that $X$ and $Y$ always represent variables. Furthermore,
in our presentation of proof rules, we are going to abstract from the order of
conjuncts in the antecedent of an implication: the critical subformula is always
represented as the first conjunct. That is, by using a pattern such as $X = t \wedge A \rightarrow B$
we are referring to any implication with an antecedent that has a conjunct of
the form $X = t$. $A$ represents the remaining conjunction, which may also be
"empty", that is, the formula $X = t \rightarrow B$ is a special case of the general pattern
$X = t \wedge A \rightarrow B$. In this case, the residue $A \rightarrow B$ represents the formula $B$.

Proof rules. For each of the proof rules in our system, we specify the type of 
formula(s) which may trigger the rule ("given"), a number of side conditions that
need to be met, and the required action (such as replacing the given formula by 
a different one). Executing this action yields one or more successor nodes and 
the current node can be discarded. The first rule replaces a defined predicate 
occurring as an atom in the node by its defining disjunction: 

Unfolding atoms
Given: $p(\vec{t})$
Cond.: $[p(\vec{X}) \leftrightarrow D_1 \vee \cdots \vee D_n] \in Th$
Action: replace by $(D_1 \vee \cdots \vee D_n)[\vec{X}/\vec{t}]$

Note that any variables in $D_1 \vee \cdots \vee D_n$ other than those in $\vec{X}$ are existentially
quantified with respect to the definition, i.e. they must be new to the node and
they will be existentially quantified in the successor node.
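The unfolding step, including the renaming of the definition's local variables, might be sketched as follows. The data layout is assumed for illustration: a theory maps a predicate name to its head variables and a list of disjuncts, atoms are `(name, args)` pairs with flat arguments, and fresh variables are generated by appending a counter value to the original name, assuming no existing variable already has that form.

```python
from itertools import count

def is_var(term):
    """Assumed convention: variables are capitalised strings."""
    return isinstance(term, str) and term[:1].isupper()

def unfold_atom(atom, theory, fresh):
    """Replace p(t) by (D_1 v ... v D_n)[X/t], renaming local variables apart."""
    pred, args = atom
    head_vars, disjuncts = theory[pred]
    instantiated = []
    for disjunct in disjuncts:
        subst = dict(zip(head_vars, args))            # the substitution [X/t]
        new_disjunct = []
        for name, atom_args in disjunct:
            new_args = []
            for a in atom_args:
                if is_var(a) and a not in subst:      # local variable: make it new
                    subst[a] = f"{a}_{next(fresh)}"
                new_args.append(subst[a] if is_var(a) else a)
            new_disjunct.append((name, tuple(new_args)))
        instantiated.append(new_disjunct)
    return instantiated

# The theory of Example 1 (T1 stands for T').
theory = {
    "p": (("T",), [[("q", ("X", "T1")), ("<", ("T1", "T")), ("<", ("T", 8))]]),
    "q": (("X", "T"), [[("=", ("X", "a")), ("s", ("T",))]]),
}
unfolded = unfold_atom(("p", (6,)), theory, count(1))
```

Unfolding $p(6)$ yields a single disjunct in which $T$ has been replaced by 6 and the local variables $X$ and $T'$ have been replaced by fresh names. Unfolding the resulting $q$ atom then exposes the abducible $s$.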







Unfolding predicates in the antecedent of an implication yields one new im- 
plication for every disjunct in the defining disjunction: 

Unfolding within implications
Given: $p(\vec{t}) \wedge A \rightarrow B$
Cond.: $[p(\vec{X}) \leftrightarrow D_1 \vee \cdots \vee D_n] \in Th$
Action: replace by $D_1[\vec{X}/\vec{t}] \wedge A \rightarrow B, \ldots, D_n[\vec{X}/\vec{t}] \wedge A \rightarrow B$

Observe that variables in any of the $D_i$ that have been existentially quantified in
the definition of $p(\vec{t})$ are going to be universally quantified in the corresponding
new implication (because they appear within the antecedent).

The next rule is the propagation rule, which allows us to resolve an atom in 
the antecedent of an implication with a matching atom in the node. Unlike most 
rules, this rule does not replace a given formula, but it merely adds a new one. 
This is why we require explicitly that propagation cannot be applied again to 
the same pair of formulas. Otherwise the procedure would be bound to loop. 

Propagation
Given: $p(\vec{t}) \wedge A \rightarrow B$ and $p(\vec{s})$
Cond.: the rule has not yet been applied to this pair of formulas
Action: add $\vec{t} = \vec{s} \wedge A \rightarrow B$
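The role of the side condition in preventing loops can be seen in a small sketch. The goal encoding is assumed for illustration: atoms are `("atom", pred, args)`, equalities are `("eq", t, s)`, implications are `("imp", antecedent, consequent)` with tuples of goals, and `applied` records the pairs to which the rule has already been applied.

```python
def propagate(node, applied):
    """Apply propagation once to every not yet treated (implication, atom) pair."""
    new_goals = []
    for imp in node:
        if imp[0] != "imp":
            continue
        _, antecedent, consequent = imp
        for k, lit in enumerate(antecedent):
            if lit[0] != "atom":
                continue
            for atom in node:
                matches = atom[0] == "atom" and atom[1] == lit[1]
                if matches and (imp, k, atom) not in applied:
                    applied.add((imp, k, atom))       # side condition: apply only once
                    equalities = tuple(("eq", t, s) for t, s in zip(lit[2], atom[2]))
                    rest = antecedent[:k] + antecedent[k + 1:]
                    new_goals.append(("imp", equalities + rest, consequent))
    return node + new_goals                           # formulas are added, not replaced

# Integrity constraint r(T) -> p(T) and query atom r(6) from Example 1.
ic = ("imp", (("atom", "r", ("T",)),), (("atom", "p", ("T",)),))
fact = ("atom", "r", (6,))
applied = set()
node1 = propagate([ic, fact], applied)
node2 = propagate(node1, applied)
```

The first call adds the implication $T = 6 \rightarrow p(T)$. The second call adds nothing, because the only matching pair has already been recorded in `applied`.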



The splitting rule gives rise to (not just a single but) a whole set of successor
nodes, one for each of the disjuncts in $A_1 \vee \cdots \vee A_n$, each of which gives rise to
a different branch in the derivation:

Splitting
Given: $A_1 \vee \cdots \vee A_n$
Cond.: none
Action: replace by one of $A_1, \ldots, A_n$



The next rule is a logical simplification that moves negative literals in the an- 
tecedent to the consequent of an implication: 

Negation rewriting
Given: $(A \rightarrow \bot) \wedge B \rightarrow C$
Cond.: none
Action: replace by $B \rightarrow A \vee C$



There are two further logical simplification rules: 

Logical simplification (trivial condition)
Given: $\top \wedge A \rightarrow B$
Cond.: none
Action: replace by $A \rightarrow B$

Logical simplification (redundant formulas)
Given: either $\bot \rightarrow A$ or $\top$
Cond.: none
Action: delete formula







The following factoring rule can be used to separate cases in which particular 
abducible atoms unify from those in which they do not: 

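Factoring
Given: $p(\vec{t})$ and $p(\vec{s})$
Cond.: $p$ abducible; the rule has not yet been applied to $p(\vec{t})$ and $p(\vec{s})$
Action: replace by $[p(\vec{t}) \wedge p(\vec{s}) \wedge (\vec{t} = \vec{s} \rightarrow \bot)] \vee [p(\vec{t}) \wedge \vec{t} = \vec{s}]$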

The next few rules deal with equalities. The first two of these involve simplifying
equalities according to the following rewrite rules:

(1) Replace $f(t_1, \ldots, t_k) = f(s_1, \ldots, s_k)$ by $t_1 = s_1 \wedge \cdots \wedge t_k = s_k$.

(2) Replace $f(t_1, \ldots, t_k) = g(s_1, \ldots, s_l)$ by $\bot$ if $f$ and $g$ are distinct or $k \neq l$.

(3) Replace $t = t$ by $\top$.

(4) Replace $X = t$ by $\bot$ if $t$ contains $X$.

(5) Replace $t = X$ by $X = t$ if $X$ is a variable and $t$ is not.

(6) Replace $Y = X$ by $X = Y$ if $X$ is a univ. quant. variable and $Y$ is not.

(7) Replace $Y = X$ by $X = Y$ if $X$ and $Y$ are exist. quant. variables and $X$
occurs in a constraint predicate, but $Y$ does not.

Rules (1)-(4) essentially implement the term reduction part of the unification
algorithm of Martelli and Montanari [7]. Rules (5)-(7) ensure that completely
rewritten equalities are always presented in a normal form, thereby simplifying
the formulation of our proof rules.
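Rules (1)-(5) can be sketched as a small rewriting loop. Rules (6) and (7) depend on the quantification status of variables and are left out here. The term encoding is assumed for illustration: variables are Python strings, compound terms are tuples `(functor, arg_1, ..., arg_k)`, and constants are one-element tuples such as `("a",)`. The sentinels `TOP` and `BOTTOM` stand for $\top$ and $\bot$.

```python
TOP, BOTTOM = "TOP", "BOTTOM"

def is_var(term):
    return isinstance(term, str)

def occurs(x, term):
    """Does variable x occur in term?"""
    if is_var(term):
        return x == term
    return any(occurs(x, arg) for arg in term[1:])

def rewrite_step(lhs, rhs):
    """One application of rules (1)-(5); None means no rule applies."""
    if lhs == rhs:
        return TOP                                   # rule (3)
    if is_var(lhs):
        return BOTTOM if occurs(lhs, rhs) else None  # rule (4)
    if is_var(rhs):
        return [(rhs, lhs)]                          # rule (5)
    if lhs[0] != rhs[0] or len(lhs) != len(rhs):
        return BOTTOM                                # rule (2)
    return list(zip(lhs[1:], rhs[1:]))               # rule (1)

def reduce_equalities(equalities):
    """Rewrite until every remaining equality has the form X = t."""
    todo, done = list(equalities), []
    while todo:
        lhs, rhs = todo.pop()
        result = rewrite_step(lhs, rhs)
        if result == BOTTOM:
            return BOTTOM
        if result == TOP:
            continue
        if result is None:
            done.append((lhs, rhs))
        else:
            todo.extend(result)
    return done

solved = reduce_equalities([(("f", "X", ("a",)), ("f", ("b",), "Y"))])
```

For the equality $f(X, a) = f(b, Y)$ the loop returns the two normalised equalities $X = b$ and $Y = a$. A functor clash or a failed occurs check makes the whole conjunction collapse to `BOTTOM`.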

Equality rewriting for atoms
Given: $t_1 = t_2$
Cond.: the rule has not yet been applied to this equality
Action: replace by the result of rewriting $t_1 = t_2$

Equality rewriting for implications
Given: $t_1 = t_2 \wedge A \rightarrow B$
Cond.: the rule has not yet been applied to this equality
Action: replace by $C \wedge A \rightarrow B$ where $C$ is the result of rewriting $t_1 = t_2$

The following two substitution rules also handle equalities: 

Substitution rule for atoms
Given: $X = t$
Cond.: $X \notin t$; the rule has not yet been applied to this equality
Action: apply substitution $[X/t]$ to entire node except $X = t$ itself

Substitution rule for implications
Given: $X = t \wedge A \rightarrow B$
Cond.: $X$ univ. quant.; $X \notin t$; $t$ contains no univ. quant. variables or $X \notin B$
Action: replace by $(A \rightarrow B)[X/t]$

The purpose of the third side condition (of t not containing any universally 
quantified variables or X not occurring within B) is to maintain the quantifi- 
cation invariant that any universally quantified variable in the consequent of an 
implication is also present in the antecedent of the same implication. 

If neither equality rewriting nor a substitution rule is applicable, then an
equality may give rise to a case analysis:







Case analysis for equalities
Given: $X = t \wedge A \rightarrow B$ (exception: do not apply to $X = t \rightarrow \bot$)
Cond.: $X$ exist. quant.; $X \notin t$; $t$ is not a univ. quant. variable
Action: replace by $X = t$ and $A \rightarrow B$, or replace by $X = t \rightarrow \bot$

Case analysis should not be applied to formulas of the form $X = t \rightarrow \bot$ (despite
this being an instance of the pattern $X = t \wedge A \rightarrow B$), because this would lead
to a loop (with respect to the second successor node). Also note that, if the third
of the above side conditions was not fulfilled and if $t$ was a universally quantified
variable, then equality rewriting could be applied to obtain $t = X \wedge A \rightarrow B$, to
which we could then apply the substitution rule for implications.

Observe that the above rule gives rise to two successor nodes (rather than
a disjunction). This is necessary, because the term t may contain variables that
would be quantified differently on the two branches, i.e. a new formula with
a disjunction in the matrix would not (necessarily) be logically equivalent to
the disjunction of the two (quantified) subformulas. In particular, in the first
successor node all variables in t will become existentially quantified. To see this,
consider the example of the implication X = f(Y) ∧ A → B and assume X
is existentially quantified, while Y is universally quantified. We can distinguish
two cases: (1) either X represents a term whose main functor is f, or (2) this
is not the case. In case (1), there exists a value for Y such that X = f(Y), and
furthermore A → B must hold. Otherwise, i.e. in case (2), X = f(Y) will be
false for all values of Y.

Case analysis for constraints 
Given: Con ∧ A → B

Cond.: Con is a constraint predicate without univ. quant. variables
Action: replace by [Con ∧ (A → B)] ∨ ¬Con



Observe that the conditions on quantification are a little stricter for case analysis 
for constraints than they were for case analysis for equalities. Now all variables 
involved need to be existentially quantified. This simplifies the presentation of 
the rule a little, because no variables change quantification. In particular, we 
can replace the implication in question by a disjunction (to which the splitting 
rule may be applied in a subsequent step).

While case analysis is used to separate constraints from other predicates, the 
next rule provides the actual constraint solving step itself. It may be applied to 
any set of constraints in a node, but to guarantee soundness, eventually, it has 
to be applied to the set of all constraint atoms. 

Constraint solving 

Given: constraint predicates Con_1, ..., Con_n
Cond.: {Con_1, ..., Con_n} is not satisfiable
Action: replace by ⊥

If {Con_1, ..., Con_n} is found to be satisfiable it may also be replaced with
an equivalent but simplified set (in case the constraint solver used offers this
feature). To simplify presentation, we assume that the constraint solver will fail
(rather than come back with an undefined answer) whenever it is presented with
an ill-defined constraint such as, say, bob < 5 (in the case of an arithmetic solver).
For inputs that are “well-typed”, however, such a situation will never arise.

Our next two rules ensure that (dis)equalities that affect the satisfiability of
the constraints in a node are correctly rewritten using the appropriate constraint 
predicates. Here we refer to a variable as a constraint variable (with respect to 
a particular node) iff that variable occurs inside a constraint atom in that node. 
For the purpose of stating the next two rules in a concise manner, we call a term 
c-atomic iff it is either a variable or a ground element of the constraint domain 
(e.g. an integer in the case of an arithmetic domain). 

Equality-constraint rewriting 
Given: X = t 

Cond.: X is a constraint variable 

Action: replace by X =_c t if t is c-atomic; replace by ⊥ otherwise



Disequality-constraint rewriting 

Given: X = t → ⊥

Cond.: X is a constraint variable

Action: replace by X ≠_c t if t is c-atomic; delete formula otherwise



For example, if we are working with an arithmetic constraint domain, then the
formula X = bob → ⊥ would be deleted from the node as it holds vacuously
whenever X also occurs within a constraint predicate.
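The two rewriting rules can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: the constraint domain is the integers, variables are strings starting with an upper-case letter, and the results are encoded as tuples tagged `"=c"` and `"!=c"`. The caller is assumed to have already checked that X is a constraint variable.

```python
BOTTOM = "⊥"  # marks failure


def is_variable(term):
    """Assumed encoding: variables are strings starting with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()


def is_c_atomic(term):
    """c-atomic: a variable or a ground element of the constraint domain (here: an int)."""
    return is_variable(term) or (isinstance(term, int) and not isinstance(term, bool))


def rewrite_equality(x, t):
    """X = t becomes X =_c t if t is c-atomic, and ⊥ otherwise."""
    return ("=c", x, t) if is_c_atomic(t) else BOTTOM


def rewrite_disequality(x, t):
    """X = t → ⊥ becomes X ≠_c t if t is c-atomic; otherwise the formula is deleted (None)."""
    return ("!=c", x, t) if is_c_atomic(t) else None


print(rewrite_equality("X", 5))
print(rewrite_equality("X", "bob"))
print(rewrite_disequality("X", "bob"))
```

In this encoding the disequality involving `bob` simply disappears from the node, which mirrors the arithmetic example in the text.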

We call a formula of the form t1 = t2 → ⊥ a disequality provided no universally
quantified variables occur in either t1 or t2. The next rule is used to
identify nodes containing formulas with problematic quantification, which could
cause difficulties in extracting an abductive answer:

Dynamic allowedness rule (DAR) 

Given: A → B (exception: do not apply to disequalities)

Cond.: A consists of equalities and constraints alone; no other rule applies 
Action: label node as undefined 

In view of the second side condition, recall that the only rules applicable to an 
implication with only equalities and constraints in the antecedent are the equality 
rewriting and substitution rules for implications and the two case analysis rules. 

Answer extraction. A node containing ⊥ is called a failure node. If all branches
in a derivation terminate with failure nodes, then the derivation is said to fail 
(the intuition being that there exists no answer to the query). A node to which 
no more rules can be applied is called a final node. A final node that is not a 
failure node and that has not been labelled as undefined is called a success node. 

Definition 2 (Extracted answer). An extracted answer for a final success
node N is a triple ⟨Δ, Φ, Γ⟩, where Δ is the set of abducible atoms, Φ is the set
of equalities and disequalities, and Γ is the set of constraint atoms in N.



An extracted answer in itself is not yet a correct answer in the sense of Definition 1,
but, as we shall see, it does induce such a correct answer. The basic idea
is to first define a substitution σ that is consistent with both the (dis)equalities
in Φ and the constraints in Γ, and then to ground the set of abducibles Δ by
applying σ to it. The resulting set of ground abducible atoms together with the
substitution σ then constitutes a correct answer to the query (i.e., an extracted
answer will typically give rise to a whole range of correct answers). To argue
that this is indeed possible, i.e. to show that the described procedure of deriving
answers to a query is a sound operation, will be the subject of the next section.

Example 2. We show the derivation for the query r(6) given the abductive logic
program of Example 1. Recall that CIFF is initiated with the node N0 composed
of the query and the integrity constraints in IC.



N0 : r(6) ∧ [r(T) → p(T)]                                    [initial node]
N1 : r(6) ∧ [T = 6 → p(T)] ∧ [r(T) → p(T)]                   [by propagation]
N2 : r(6) ∧ p(6) ∧ [r(T) → p(T)]                             [by substitution]
N3 : r(6) ∧ q(X, T') ∧ T' < 6 ∧ 6 < 8 ∧ [r(T) → p(T)]        [by unfolding]
N4 : r(6) ∧ q(X, T') ∧ T' < 6 ∧ [r(T) → p(T)]                [by constraint solving]
N5 : r(6) ∧ X = a ∧ s(T') ∧ T' < 6 ∧ [r(T) → p(T)]           [by unfolding]



No more rules can be applied to the node N5 and it neither contains ⊥ nor has
it been labelled as undefined. Hence, it is a success node and we get an extracted
answer with Δ = {r(6), s(T')}, Φ = {X = a} and Γ = {T' < 6}, of which the
correct answer given in Example 1 is an instance. □
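To illustrate how one extracted answer induces a range of ground answers, the sketch below enumerates substitutions for T' that satisfy the constraint T' < 6 and grounds the abducibles accordingly. The encoding (atoms as tuples, constraints as Python predicates) and the finite search window 0..7 are assumptions made purely for illustration; over the integers the constraint has infinitely many solutions, and the paper leaves the actual solving to a constraint solver.

```python
# Extracted answer of Example 2 in an ad hoc encoding.
DELTA = [("r", 6), ("s", "T'")]        # abducible atoms
PHI = {"X": "a"}                        # equalities (there are no disequalities here)
GAMMA = [lambda s: s["T'"] < 6]         # constraint atoms as predicates over a substitution


def ground_answers(delta, phi, gamma, var, domain):
    """Return (ground abducibles, substitution) pairs for each value of `var`
    in `domain` that satisfies every constraint in `gamma`."""
    answers = []
    for value in domain:
        sigma = {**phi, var: value}
        if all(constraint(sigma) for constraint in gamma):
            # Replace variables by their values; other symbols are left untouched.
            ground = [tuple(sigma.get(arg, arg) for arg in atom) for atom in delta]
            answers.append((ground, sigma))
    return answers


answers = ground_answers(DELTA, PHI, GAMMA, "T'", range(0, 8))
for ground, sigma in answers:
    print(ground, sigma)
```

Within the assumed window, six substitutions survive (T' = 0, ..., 5), each yielding a different ground set of abducibles.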



4 Soundness Results 

In this section we are going to present the soundness of the CIFF procedure 
with respect to the semantics of a correct answer to a given query. Due to space 
restrictions, we have to restrict ourselves to short sketches of the main ideas 
involved. Full proofs may be found in [3]. Our results extend those of Fung 
and Kowalski for the original IFF procedure in two respects: (1) they apply to 
abductive logic programs with constraints, and (2) they do not rely on a static 
(and overly strict) definition of allowedness. 

For an abductive proof procedure, we can distinguish two types of soundness
results: soundness of success and soundness of failure. The first one establishes
the correctness of derivations that are successful (soundness of success): whenever
the CIFF procedure terminates successfully then the extracted answer (consisting
of a set of abducible atoms, a set of equalities and disequalities, and a set of
constraints) gives rise to a true answer according to the semantics of ALP (i.e.
a ground set of abducible atoms and a substitution). Note that for this result to
apply, it suffices that a single final success node can be derived. This node will
give rise to a correct answer, even if there are other branches in the derivation
that do not terminate or for which the DAR has been triggered. The second
soundness result applies to derivations that fail (soundness of failure): it states
that whenever the CIFF procedure fails then there is indeed no answer according
to the semantics. This result applies only when all branches in a derivation have
failed; if there are branches that do not terminate or for which the DAR has
been triggered, then we cannot draw any conclusions regarding the existence of
an answer to the query (assuming there are no success nodes). 

The proofs of both these results heavily rely on the fact that our proof rules 
are equivalence preserving: 

Lemma 1 (Equivalence preservation). If N is a node in a derivation with
respect to the theory Th, and 𝒩 is the disjunction of the immediate successor
nodes of N in that derivation, then Th ⊨ N ↔ 𝒩.

Note that the disjunction 𝒩 will have only a single disjunct whenever the rule
applied to N is neither splitting nor case analysis for equalities. Equivalence
preservation is easily verified for most of our proof rules. Considering that IC ∧ Q
is the initial node of any derivation, the next lemma then follows by induction
over the number of proof steps leading to a final success node:

Lemma 2 (Final nodes entail initial node). If N is a final success node for
the input ⟨Th, IC, Q⟩, then Th ⊨ N → (IC ∧ Q).

Our third lemma provides the central argument in showing that it is possible to 
extract a correct abductive answer from a final success node: 

Lemma 3 (Answer extraction). If N is a final success node and Δ is
the set of abducible atoms in N, then there exists a substitution σ such that
Comp(Δσ) ⊨ Nσ.

The first step in proving this lemma is to show that any formulas in N that are 
not directly represented in the extracted answer must be implications where the 
antecedent includes an abducible atom and no negative literals. We can then 
show that implications of this type are logical consequences of Comp(Δσ) by
distinguishing two cases: either propagation has been applied to the implication 
in question, or it has not. In the latter case, the claim holds vacuously (because 
the antecedent is not true); in the former case we use an inductive argument 
over the number of abducible atoms in the antecedent. 

The full proof of Lemma 3 makes reference to all proof rules except factoring. 
Indeed, factoring is not required to ensure soundness. However, as can easily be 
verified, factoring is equivalence preserving in the sense of Lemma 1; that is, our 
soundness results apply both to the system with and to the system without the 
factoring rule. We are now ready to state these soundness results: 

Theorem 1 (Soundness of success). If there exists a successful derivation
for the input ⟨Th, IC, Q⟩, then there exists a correct answer for that input.

Theorem 2 (Soundness of failure). If there exists a derivation for the input
⟨Th, IC, Q⟩ that terminates and where all final nodes are failure nodes, then there
exists no correct answer for that input.

Theorem 1 follows from Lemmas 2 and 3, while Theorem 2 can be proved by 
induction over the number of proof steps in a derivation, using Lemma 1 in the 
induction step. We should stress that these soundness results only apply in cases 
where the DAR has not been triggered and the CIFF procedure has terminated
with a defined outcome, namely either success or failure. Hence, such results are
only interesting if we can give some assurance that the DAR is “appropriate”: in
a similar but ill-defined system where an (inappropriate) allowedness rule would 
simply label all nodes as undefined, it would still be possible to prove the same 
soundness theorems, but they would obviously be of no practical relevance. 

The reason why our rule is indeed appropriate is that extracting an answer 
from a node labelled as undefined by the DAR would either require us to extend 
the definition of a correct answer to allow for infinite sets of abducible atoms 
or at least involve the enumeration of all the solutions to a set of constraints. 
We shall demonstrate this by means of two simple examples. First, consider the 
following implication: X = f(Y) → p(X)



If both X and Y are universally quantified, then this formula will trigger the
DAR. Its meaning is that the predicate p is true whenever its argument is of the
form f(_). Hence, an “answer” induced by a node containing this implication
would have to include the infinite set {p(f(t1)), p(f(t2)), ...}, where t1, t2, ...
stand for the terms in the Herbrand universe. This can also be seen by considering
that, if we were to ignore the side conditions on quantification of the substitution
rule for implications, the above implication could be rewritten as p(f(Y)), with
Y still being universally quantified.

For the next example, assume that our constraint language includes the
predicate < with the usual interpretation over integers:

3 < X ∧ X < 100 → p(X)

Again, if the variable X is universally quantified, this formula will trigger the
DAR. While it would be possible to extract a finite answer from a node including
this formula, this would require us to enumerate all solutions to the constraint
3 < X ∧ X < 100; that is, a correct answer would have to include the set of atoms
{p(4), p(5), ..., p(99)}. In cases where the set of constraints concerned has an
infinite number of solutions, even in theory, it is not possible to extract a correct
answer (as it would be required to be both ground and finite).
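The size of the finite answer in the second example is easy to check by brute force. The short Python sketch below enumerates the integers strictly between 3 and 100 and builds the corresponding atoms as strings; the search window and the string encoding of atoms are assumptions made only for this illustration.

```python
# Integers strictly between 3 and 100, found by scanning an assumed search window.
solutions = [x for x in range(-1000, 1000) if 3 < x and x < 100]

# One ground atom per solution of the constraint.
atoms = [f"p({x})" for x in solutions]

print(len(atoms), atoms[0], atoms[-1])
```

The enumeration yields 96 atoms, from p(4) to p(99), which shows why extracting such an answer amounts to listing every solution of the constraint.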



5 Conclusion 

We have introduced a new proof procedure for ALP that extends the IFF pro- 
cedure in a non-trivial way by integrating abductive reasoning with constraint 
solving. Our procedure shares the advantages of the IFF procedure [4], but cov- 
ers a larger class of inputs: (1) predicates belonging to a suitable constraint 
language may be used, and (2) the allowedness conditions have been reduced to 
a minimum. Both these extensions are important requirements for our applications
of ALP to modelling and implementing autonomous agents [5,11]. In cases
where no answer is possible due to allowedness problems, the CIFF procedure 
will report this dynamically. However, if an answer is possible despite such prob- 
lems, CIFF will report a defined answer. For instance, one node may give rise 
to a positive answer while another has a non-allowed structure, or a derivation 
may fail correctly for reasons that are independent of a particular non-allowed
integrity constraint. For inputs conforming to any appropriate static allowedness
definition, the DAR will never be triggered. 

We have proved two soundness results for CIFF: soundness of success and 
soundness of failure. Together these two results also capture some aspect of com- 
pleteness: For any class of inputs that are known to be allowed (in the sense of 
never triggering the DAR) and for which termination can be guaranteed (for in- 
stance, by imposing suitable acyclicity conditions [12]) the CIFF procedure will 
terminate successfully whenever there exists a correct answer according to the 
semantics. We hope to investigate the issues of termination and completeness 
further in our future work. Another interesting issue for future work on CIFF 
would be to investigate different strategies for proof search and other optimisa- 
tion techniques. Such research could then inform an improvement of our current 
implementation and help to make it applicable to more complex problems. 

Acknowledgements. This work was partially funded by the IST-FET pro- 
gramme of the European Commission under the IST-2001-32530 SOCS project, 
within the Global Computing proactive initiative. The last author was also sup- 
ported by the Italian MIUR programme “Rientro dei cervelli” . 

References 

[1] M. Carlsson, G. Ottosson, and B. Carlson. An open-ended finite domain constraint 
solver. In Proc. PLILP-1997 , 1997. 

[2] K. L. Clark. Negation as failure. In Logic and Data Bases. Plenum Press, 1978. 

[3] U. Endriss, P. Mancarella, F. Sadri, G. Terreni, and F. Toni. The CIFF proof pro- 
cedure: Definition and soundness results. Technical Report 2004/2, Department 
of Computing, Imperial College London, May 2004. 

[4] T. H. Fung and R. A. Kowalski. The IFF proof procedure for abductive logic 
programming. Journal of Logic Programming, 33(2):151-165, 1997.

[5] A. C. Kakas, P. Mancarella, F. Sadri, K. Stathis, and F. Toni. The KGP model 
of agency. In Proc. ECAI-2004, 2004. To appear.

[6] A. C. Kakas, A. Michael, and C. Mourlas. ACLP: Abductive constraint logic 
programming. Journal of Logic Programming, 44:129-177, 2000. 

[7] A. Martelli and U. Montanari. An efficient unification algorithm. ACM Transac- 
tions on Programming Languages and Systems, 4(2):258-282, 1982. 

[8] A. Russo, R. Miller, B. Nuseibeh, and J. Kramer. An abductive approach for 
analysing event-based requirements specifications. In Proc. ICLP-2002. Springer-Verlag, 2002.

[9] F. Sadri, F. Toni, and P. Torroni. An abductive logic programming architecture 
for negotiating agents. In Proc. JELIA-2002. Springer-Verlag, 2002.

[10] M. Shanahan. An abductive event calculus planner. Journal of Logic Program- 
ming, 44:207-239, 2000. 

[11] K. Stathis, A. Kakas, W. Lu, N. Dometriou, U. Endriss, and A. Bracciali. 
PROSOCS: A platform for programming software agents in computational logic. 
In Proc. AT2AI-2004, 2004. 

[12] I. Xanthakos. Semantic Integration of Information by Abduction. PhD thesis, 
Department of Computing, Imperial College London, 2003. 




Hierarchical Decision Making by Autonomous Agents 



Stijn Heymans, Davy Van Nieuwenborgh*, and Dirk Vermeir**



Dept. of Computer Science
Vrije Universiteit Brussel, VUB
Pleinlaan 2, B1050 Brussels, Belgium
{sheymans,dvnieuwe,dvermeir}@vub.ac.be



Abstract. Often, decision making involves autonomous agents that are structured 
in a complex hierarchy, representing e.g. authority. Typically the agents share 
the same body of knowledge, but each may have its own, possibly conflicting, 
preferences on the available information. 

We model the common knowledge base for such preference agents as a logic 
program under the extended answer set semantics, thus allowing for the defeat of 
rules to resolve conflicts. An agent can express its preferences on certain aspects 
of this information using a partial order relation on either literals or rules. Placing 
such agents in a hierarchy according to their position in the decision making 
process results in a system where agents cooperate to find solutions that are jointly 
preferred. 

We show that a hierarchy of agents with either preferences on rules or on literals 
can be transformed into an equivalent system with just one type of preferences. 
Regarding the expressiveness, the formalism essentially covers the polynomial 
hierarchy. E.g. the membership problem for a hierarchy of depth n is Σ^P_{n+2}-complete.
We illustrate an application of the approach by showing how it can
easily express a generalization of weak constraints, i.e. “desirable” constraints
that do not need to be satisfied but where one tries to minimize their violation. 



1 Introduction 

In answer set programming [16,2] one uses a logic program to modularly describe the
requirements that must be fulfilled by the solutions to a particular problem, i.e. the
answer sets of the program must correspond to the intended solutions of the problem. The
technique has been successfully applied to the area of agents and multi-agent systems [3,
8,26]. While [3] and [8] use the basic answer set semantics to represent the agents' domain
knowledge, [26] applies an extension of the semantics incorporating preferences among 
choices in a program. 

The idea of extending answer set semantics with some kind of preference relation 
is not new. We can identify two directions for these preference relations on programs.
On the one hand, we can equip a logic program with a preference relation on the rules 

* Supported by the FWO. 

** This work was partially funded by the Information Society Technologies programme of the 
European Commission, Future and Emerging Technologies under the IST-2001-37004 WASP
project. 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 44-56, 2004. 
© Springer-Verlag Berlin Heidelberg 2004







[18,17,15,10,7,5,27,1,22], while on the other hand we can consider a preference relation 
on the (extended) literals in the program: [21] proposes explicit preferences while [4,6] 
encodes dynamic preferences within the program. 

The traditional answer set semantics is not universal, i.e. programs may not have any 
answer sets at all. This behavior is not always feasible, e.g. a route planner agent may 
contain inconsistent information regarding some particular regions in Europe, which 
should not stop it from providing travel directions in general. The extended answer set 
semantics from [22,23] allows for the defeat of problematic rules. Take, for example,
the program consisting of a ← b, b ← and ¬a ←. Clearly this program has no answer
sets. It has, however, extended answer sets {a, b}, where the rule ¬a ← is defeated by
the applied a ← b, and {¬a, b}, where a ← b is defeated by ¬a ←.

However, not all extended answer sets may be equally preferred by the involved 
parties: users traveling in “error free” regions of Europe do not mind faults in answers 
concerning the problematic regions, in contrast to users traveling in these latter regions 
that want to get a “best” approximation. Therefore, we extend the above semantics by 
equipping programs with a preference relation over either the rules or the literals in a 
program. Such a preference relation can be used to induce a partial order on the extended 
answers, the minimal elements of which will be preferred. 

Different agents may exhibit different, possibly contradicting, preferences, that need 
to be reconciled into commonly accepted answer sets, while taking into account the 
relative authority of each agent. 

For example, sending elderly workers on early pension, reducing the wages, or sack- 
ing people are some of the measures that an ailing company may consider. On the other 
hand, management may be asked to sacrifice expense accounts and/or company cars. 
Demanding efforts from the workers without touching the management leads to a bad 
image for the company. Negotiations between three parties are planned: shareholders, 
management and unions. The measures under consideration, together with the influence 
on the company’s image are represented by the extended answer sets 

M1 = {bad_image, pension}             M4 = {¬bad_image, expense, sack}
M2 = {bad_image, wages}               M5 = {¬bad_image, car, wages}
M3 = {¬bad_image, expense, wages}.

The union representative, who is not allowed to reduce the options of the management,
has a preference for the pension option over the wages reduction over the
sacking option of people, not taking into account the final image of the company,
i.e. pension < wages < sack < {bad_image, ¬bad_image}. This preference strategy
will result in M1 being better than M2, while M3 is preferred over M4. Furthermore,
M5 is incomparable w.r.t. the other options. Thus M1, M3 and M5 are the
choices to be defended by the union representative. Management, on the other hand,
would rather give up its expense account than its car, regardless of company image, i.e.
expense < car < {bad_image, ¬bad_image}, yielding M1, M3 and M4 as negotiable
decisions for the management.

Finally, the shareholders take only into account the decisions that are acceptable to
both the management and the unions, i.e. M1 and M3, on which they apply their own
preference ¬bad_image < bad_image, i.e. they do not want their company to get a bad
image. As a result, M3 ⊏ M1, yielding that M3 is the preferred way to go to save the
company, taking into account each party's preferences.

Decision processes like the one above are supported by agent hierarchies, where a
program, representing the shared world of agents, is equipped with a tree of preference
relations on either rules or literals, representing the hierarchy of agents' preferences.
Semantically, preferred extended answer sets for such systems will result from first
optimizing w.r.t. the lowest agents in the hierarchy, then grouping the results according
to the hierarchy and letting the agents on the next level optimize these results, etc. Thus,
each agent applies its preferences on a selection of the preferred answers of the agents 
immediately below it in the hierarchy, where the lowest agents apply their preferences 
directly on the extended answer sets of the shared program. 

Reconsidering our example results in the system depicted below, i.e. union and 
management prefer directly, and independently, among all possible solutions, while the 
shareholders only choose among the solutions preferred by both union and management, 
obtaining a preferred solution for the complete system. 




[Figure: the agent hierarchy for the example, with the union and management agents below the shareholders agent.]



Such agent hierarchies turn out to be rather expressive. More specifically, we show 
that such systems can solve arbitrary complete problems of the polynomial hierarchy. 
We also demonstrate how systems with combined preferences, i.e. either on literals or 
on rules, can effectively be reduced to systems with only one kind of preference. 

Finally, we introduce a generalization of weak constraints [9], which are constraints 
that should be satisfied but may be violated if there are no other options, i.e. violations 
of weak constraints should be minimized. Weak constraints have useful applications in 
areas like planning, abduction and optimizations from graph theory [13,11]. We allow
for a hierarchy of agents having their individual preferences on the weak constraints 
they wish to satisfy in favor of others. We show that the original semantics of [9] can be 
captured by a single preference agent. 

The remainder of the paper is organized as follows. In Section 2, we present the 
extended answer set semantics together with the hierarchy of preference agents, enabling 
hierarchical decision making. The complexity of the proposed semantics is discussed 
in Section 3. Before concluding and giving directions for further research in Section 5, 
we present in Section 4 a generalization of weak constraints and show how the original 
semantics can be implemented. Due to lack of space, detailed proofs have been omitted. 



2 Agent Hierarchies 

We give some preliminaries concerning the extended answer set semantics [22]. A literal
is an atom a or a negated atom ¬a. An extended literal is a literal or a literal preceded
by the negation as failure symbol not. A program is a countable set of rules of the form
α ← β with α a set of literals, |α| ≤ 1, and β a set of extended literals. If α = ∅, we
call the rule a constraint. The set α is the head of the rule while β is called the body.
We will often denote rules either as a ← β or, in the case of constraints, as ← β.
For a set X of literals, we take ¬X = {¬l | l ∈ X} where ¬¬a is a; X is consistent
if X ∩ ¬X = ∅. The positive part of the body is β⁺ = {l | l ∈ β, l literal}, the
negative part is β⁻ = {l | not l ∈ β}, e.g. for β = {a, not ¬b, not c}, we have that
β⁺ = {a} and β⁻ = {¬b, c}. The Herbrand Base B_P of a program P is the set of
all atoms that can be formed using the language of P. Let L_P be the set of literals and
L*_P the set of extended literals that can be formed with P, i.e. L_P = B_P ∪ ¬B_P and
L*_P = L_P ∪ {not l | l ∈ L_P}. An interpretation I of P is any consistent subset of
L_P. For a literal l, we write I ⊨ l, if l ∈ I, which extends for extended literals not l
to I ⊨ not l if I ⊭ l. In general, for a set of extended literals X, I ⊨ X if I ⊨ x for
every extended literal x ∈ X. A rule r : a ← β is satisfied w.r.t. I, denoted I ⊨ r, if
I ⊨ a whenever I ⊨ β, i.e. r is applied whenever it is applicable. A constraint ← β
is satisfied w.r.t. I if I ⊭ β. The set of satisfied rules in P w.r.t. I is the reduct P_I. For
a simple program P (i.e. a program without not), an interpretation I is a model of P if
I satisfies every rule in P, i.e. P_I = P; it is an answer set of P if it is a minimal model
of P, i.e. there is no model J of P such that J ⊂ I. For programs P containing not, we
define the GL-reduct w.r.t. an interpretation I as P^I, where P^I contains α ← β⁺ for
α ← β in P and β⁻ ∩ I = ∅. I is an answer set of P if I is an answer set of P^I. A
rule a ← β is defeated w.r.t. I if there is a competing rule ¬a ← γ that is applied w.r.t.
I, i.e. {¬a} ∪ γ ⊆ I. An extended answer set I of a program P is an answer set of P_I
such that all rules in P\P_I are defeated.

Example 1. Take a program P expressing an intention to vote for either the Democrats 
or the Greens. Voting for the Greens will, however, weaken the Democrats, possibly 
resulting in a Republican victory. Furthermore, you have a Republican friend who may 
benefit from a Republican victory. 

dem_vote ←                          ¬dem_vote ←
green_vote ← not dem_vote           rep_win ← green_vote
fr_benefit ← rep_win                ¬fr_benefit ← rep_win

This program results in 3 different extended answer sets M1 = {dem_vote}, M2 =
{¬dem_vote, green_vote, rep_win, fr_benefit}, and M3 = {¬dem_vote, green_vote,
rep_win, ¬fr_benefit}.
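The definitions above can be checked on this example by brute force. The Python sketch below enumerates all consistent interpretations, keeps the rules each one satisfies, and tests the two conditions for an extended answer set: the interpretation equals the least model of the GL-reduct of the satisfied rules, and every unsatisfied rule is defeated. The rule encoding, the rule names r1 to r6 and all function names are assumptions made for illustration, not notation from the paper; the exhaustive enumeration is only practical for tiny programs.

```python
from itertools import product

# Example 1: name -> (head, positive body, negation-as-failure body).
# Classical negation is written with a leading "-".
RULES = {
    "r1": ("dem_vote", (), ()),
    "r2": ("-dem_vote", (), ()),
    "r3": ("green_vote", (), ("dem_vote",)),
    "r4": ("rep_win", ("green_vote",), ()),
    "r5": ("fr_benefit", ("rep_win",), ()),
    "r6": ("-fr_benefit", ("rep_win",), ()),
}


def neg(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit


def body_holds(rule, interp):
    _, pos, naf = rule
    return all(l in interp for l in pos) and all(l not in interp for l in naf)


def satisfied(rule, interp):
    head = rule[0]
    return not body_holds(rule, interp) or (head is not None and head in interp)


def defeated(rule, interp, rules):
    """A rule is defeated if a competing rule with the opposite head is applied."""
    head = rule[0]
    return head is not None and any(
        other[0] == neg(head) and other[0] in interp and body_holds(other, interp)
        for other in rules
    )


def least_model(rules, interp):
    """Least model of the GL-reduct of `rules` w.r.t. `interp` (constraints are skipped)."""
    reduct = [(h, pos) for h, pos, naf in rules
              if h is not None and not any(l in interp for l in naf)]
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in reduct:
            if head not in model and all(l in model for l in pos):
                model.add(head)
                changed = True
    return model


def extended_answer_sets(program):
    rules = list(program.values())
    atoms = sorted({l.lstrip("-") for h, pos, naf in rules
                    for l in ((h,) if h else ()) + tuple(pos) + tuple(naf)})
    found = []
    # Each atom is absent (None), true, or classically false, so interp is consistent.
    for signs in product((None, True, False), repeat=len(atoms)):
        interp = {a if s else "-" + a for a, s in zip(atoms, signs) if s is not None}
        kept = [r for r in rules if satisfied(r, interp)]
        dropped = [r for r in rules if not satisfied(r, interp)]
        if least_model(kept, interp) == interp and all(
                defeated(r, interp, rules) for r in dropped):
            found.append(frozenset(interp))
    return found


for answer in extended_answer_sets(RULES):
    print(sorted(answer))
```

Under this encoding the search returns exactly the three sets M1, M2 and M3 listed in the example.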

As mentioned in the introduction, the background knowledge for agents will be 
described by a program P. Agents can express individual preferences either on extended 
literals or on rules of P, corresponding to literal and rule agents respectively. 

Definition 1. Let P be a program. A rule agent (RA) A for P is a well-founded strict
partial¹ order < on rules in P. The order < induces a relation ⊑ among interpretations
M and N of P, such that M ⊑ N iff ∀r2 ∈ P_N \ P_M · ∃r1 ∈ P_M \ P_N · r1 < r2.

¹ A strict partial order on X is an anti-reflexive and transitive relation on X. A strict partial order
on a finite X is well-founded, i.e. every subset of X has a minimal element w.r.t. <.







A literal agent (LA) for P is a strict well-founded partial order < on L*_P, and M ⊑ N
iff ∀n ∈ {l ∈ L*_P | N ⊨ l ∧ M ⊭ l} · ∃m ∈ {l ∈ L*_P | M ⊨ l ∧ N ⊭ l} · m < n.

The extended answer sets of an agent for P correspond to the extended answer sets
for P. As usual, we have M ⊏ N iff M ⊑ N and not N ⊑ M. A preferred answer set
M is an extended answer set that is minimal w.r.t. ⊏ among the extended answer sets.

Note that a RA < for P corresponds to an ordered logic program (OLP) (P, <) from
[22].

We refer to the order of an agent A with <_A and ⊑_A. Intuitively, for rule agents,
an extended answer set M is “better” than N if each rule that is satisfied by N but not
by M is countered by a better rule satisfied by M and not by N. Similarly, for literal
agents we have that M ⊑ N if every extended literal that is true in N, but not in M, is
countered by a better one true in M but not in N.

E.g., define a rule agent fr_benefit ← rep_win < ¬fr_benefit ← rep_win for
the program P in Example 1, indicating that one rather satisfies the former rule than
the latter. We have, with P_M1 = P\{¬dem_vote ←}, P_M2 = P\{dem_vote ←,
¬fr_benefit ← rep_win} and P_M3 = P\{dem_vote ←, fr_benefit ← rep_win},
that M2 ⊏ M3, yielding that M1 and M2 are the only preferred answer sets.
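The comparison in this example can be replayed mechanically from the reducts alone. In the Python sketch below the rules of Example 1 are given the hypothetical names r1 to r6 (in the order dem_vote ←, ¬dem_vote ←, green_vote ← not dem_vote, rep_win ← green_vote, fr_benefit ← rep_win, ¬fr_benefit ← rep_win), and the rule agent is the single pair r5 < r6. The function names are illustrative.

```python
ALL_RULES = {"r1", "r2", "r3", "r4", "r5", "r6"}

# Reducts (sets of satisfied rules) of the three extended answer sets.
REDUCT = {
    "M1": ALL_RULES - {"r2"},
    "M2": ALL_RULES - {"r1", "r6"},
    "M3": ALL_RULES - {"r1", "r5"},
}

# The rule agent: (x, y) means rule x is preferred to rule y.
ORDER = {("r5", "r6")}


def weakly_better(m, n):
    """M ⊑ N: every rule satisfied by N but not by M is countered by a
    better rule satisfied by M but not by N."""
    return all(
        any((r1, r2) in ORDER for r1 in REDUCT[m] - REDUCT[n])
        for r2 in REDUCT[n] - REDUCT[m]
    )


def strictly_better(m, n):
    """M ⊏ N: M ⊑ N and not N ⊑ M."""
    return weakly_better(m, n) and not weakly_better(n, m)


preferred = sorted(m for m in REDUCT
                   if not any(strictly_better(n, m) for n in REDUCT))
print(preferred)  # ['M1', 'M2']
```

M3 is eliminated because M2 counters its only distinguishing rule r6 with the better rule r5, while M1 is incomparable to both.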

A literal agent might insist on voting for the Democrats: dem_vote < L*_P \
{dem_vote}, making M1 its only preferred answer set.

The cooperation of agents for a program $P$ is established by arranging them in a
tree-structure$^{2}$, such that decisions are made bottom-up, starting with agents that have
no successors, all the way up in the hierarchy to the root agent, each agent processing
the results of its successor agents. Formally, an agent hierarchy (AH) is a pair $(P, T)$
where $P$ is a program and $T$ is a finite and/or-tree of agents $A$ for $P$.

We will denote the root agent $A$ of the tree $T$ with $A_\varepsilon$. The $m$ successors of an agent
$A_x$ are denoted as $A_{x \cdot 1}, \ldots, A_{x \cdot m}$. An agent without successors is called an independent
agent, other agents are dependent. An agent associated with an and-node (or-node) will
be called an and-agent (or-agent). We define what it means for an extended answer set
to be preferable by a certain agent in the hierarchy.

Definition 2. Let $(P, T)$ be an AH. An extended answer set $M$ of $P$ is preferable by an
independent agent $A$ of $T$ if $M$ is a preferred answer set of $A$ for $P$. An extended answer
set $M$ of $P$ is preferable by a dependent and-agent (or-agent) $A_x$, with $m$ successors, if

- $M$ is preferable by every (some) $A_{x \cdot i}$, $1 \le i \le m$, and

- there is no $N$, preferable by every (some) $A_{x \cdot j}$, $1 \le j \le m$, such that $N \sqsubset_{A_x} M$.

An extended answer set $M$ of $P$ is preferred if it is preferable by $A_\varepsilon$.
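
The bottom-up evaluation of Definition 2 can be sketched as a short recursive function. The sketch below is an illustration under stated assumptions, not the authors' implementation: the `Agent` class, the `prefers` helper and the answer-set names are hypothetical, extended answer sets are represented only by their names, and each agent's strict relation $\sqsubset$ is supplied directly as a function `better(n, m)`.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Agent:
    # better(n, m) is True when this agent strictly prefers n over m (N ⊏ M).
    better: Callable[[str, str], bool]
    kind: str = "and"                      # "and" or "or"; ignored for independent agents
    successors: List["Agent"] = field(default_factory=list)

def preferable(agent, answer_sets):
    """Names of the extended answer sets that are preferable by `agent`."""
    if not agent.successors:
        # Independent agent: minimise over all extended answer sets.
        pool = set(answer_sets)
    else:
        results = [preferable(s, answer_sets) for s in agent.successors]
        combine = set.intersection if agent.kind == "and" else set.union
        pool = combine(*results)           # preferable by every / by some successor
    return {m for m in pool if not any(agent.better(n, m) for n in pool)}

def prefers(*pairs):
    """Build a strict preference from explicit pairs (n, m), read as n ⊏ m."""
    table = set(pairs)
    return lambda n, m: (n, m) in table

answer_sets = {"M1", "M2", "M3"}
a1 = Agent(better=prefers(("M1", "M3")))
a2 = Agent(better=prefers(("M2", "M1")))
and_root = Agent(better=prefers(), kind="and", successors=[a1, a2])
or_root = Agent(better=prefers(("M3", "M2")), kind="or", successors=[a1, a2])
```

In this toy hierarchy the and-root only keeps what both successors accept, while the or-root starts from everything some successor accepts and then applies its own preference.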

Rule agents (OLPs) are rather convenient to formulate diagnostic problems [24,25], using
“normal” and “fault” model rules to describe the system under consideration, where
the former are preferred over the latter. Examples of this approach can be found in
[24,25], where it has also been shown that the OLP semantics yields minimal possible
explanations. However, to decide which explanations to check first, an engineer typically

$^{2}$ For simplicity we restrict ourselves to trees; however, the results remain valid for any
well-founded strict partial order of agents that has a unique maximal agent.




Hierarchical Decision Making by Autonomous Agents 






uses another preference order, preferring e.g. explanations that are cheaper to verify. Such 
a situation can be modelled by an agent hierarchy containing an extra agent “above” the 
diagnostic RA. More generally, one may imagine situations where multiple engineers 
each have their own experience (expressed by preference) and where the head of the 
group has to take the final decision on which possible explanations to check first, taking 
into account the proposals of her colleagues. Such systems can easily be expressed using 
the proposed framework. 

Reasoning w.r.t. agent hierarchies, containing both rule and literal agents, can be 
reduced to reasoning w.r.t. rule agent hierarchies (RAHs) or literal agent hierarchies 
(LAHs), i.e. hierarchies containing only rule or literal agents. 

We show the reduction from RAs to LAs and vice versa. For the reduction of RAs for 
P to LAs, we introduce for every rule r in P a corresponding atom that is in an answer 
set iff r is satisfied. Intuitively, the newly introduced atoms will be ordered according to 
the original order on the rules they correspond with. 

Theorem 1. Let $P$ be a program and
$R = \{r_i \leftarrow not\ b \mid r_i : a \leftarrow \beta \in P, b \in \beta^+\} \cup \{r_i \leftarrow b \mid r_i : a \leftarrow \beta \in P, b \in \beta^-\} \cup \{r_i \leftarrow a \mid r_i : a \leftarrow \beta \in P\}$
with a new atom $r_i$ for each rule $r_i$ in $P$. $M$ is a preferred answer set of a RA $A^r$ for $P$ iff
$M' = M \cup \{r_i \mid r_i \in P_M\}$ is a preferred answer set of the LA $A^l$ for $P \cup R$ where
$\{r_i\} <_{A^l} \mathcal{L}_P <_{A^l} not(\mathcal{L}_{P \cup R})$ with additionally $r_i <_{A^l} r_j$ iff $r_i <_{A^r} r_j$.
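
The rule set $R$ of Theorem 1 is purely syntactic and can be generated mechanically. The following Python sketch shows one way to do so; the tuple encoding of rules, the function name `rule_atoms`, the rule name `r1`, and the decision to skip the head rule for headless rules are assumptions made for the illustration.

```python
def rule_atoms(program):
    """Build R for a program given as {name: (head, positive_body, negative_body)}.
    Every produced rule is again a triple (head, positive_body, negative_body);
    the new atom named after a rule is derivable exactly when that rule is satisfied."""
    R = []
    for name, (head, pos, neg) in program.items():
        R += [(name, (), (b,)) for b in pos]   # r_i <- not b, for b in the positive body
        R += [(name, (b,), ()) for b in neg]   # r_i <- b, for b under negation as failure
        if head is not None:                   # assumption: headless rules get no head rule
            R.append((name, (head,), ()))      # r_i <- a, the head of r_i holds
    return R

# Hypothetical one-rule program: r1 : a <- b, not c
R = rule_atoms({"r1": ("a", ("b",), ("c",))})
```

For the single rule `r1` this yields the three rules `r1 <- not b`, `r1 <- c` and `r1 <- a`, one for each way in which `r1` can be satisfied.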

Moreover, preferred answer sets of a LA for $P$ are in one-to-one correspondence
with the preferred answer sets of a LA for $P \cup R$ by simply ignoring the newly added
atoms $r_i$.

Theorem 2. Let $P$ be a program and $R$ as in Theorem 1. $M$ is a preferred answer set
of a LA $A$ for $P$ iff $M' = M \cup \{r_i \mid r_i \in P_M\}$ is a preferred answer set of the LA
$A'$ for $P \cup R$ where $<_{A'}$ is equal to $<_A$ with additionally
$k <_{A'} \mathcal{L}^*_{P \cup R} \setminus \mathcal{L}^*_P$ for every extended literal $k$ appearing in $<_A$.

The opposite simulation of a LA by a RA can be done by introducing for each literal $l$
and its extended version $not\ l$ rules $l' \leftarrow$ and $\neg l' \leftarrow$ and ordering those rules according
to the order on the extended literals.

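Theorem 3. Let $P$ be a program and
$L = \{l' \leftarrow\ ;\ \neg l' \leftarrow\ \mid l \in \mathcal{L}_P\} \cup \{\leftarrow l', not\ l\ ;\ \leftarrow \neg l', l \mid l \in \mathcal{L}_P\}$.
$M$ is a preferred answer set of a LA $A^l$ for $P$ iff
$M' = M \cup \{(\neg)l' \mid l \in \mathcal{L}_P, M \models (not)\, l\}$ is a preferred answer set of the RA $A^r$ for
$P \cup L$ with $L <_{A^r} P$ and additionally $(\neg)l' \leftarrow\ <_{A^r}\ (\neg)k' \leftarrow$ iff $(not)\, l <_{A^l} (not)\, k$.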

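Example 2. Take a LA $A^l$ for $P$ where $P$ consists of the rules $b \leftarrow a$, $a \leftarrow$, and $\neg a \leftarrow$,
and $\neg a <_{A^l} \{a, b, not\ \neg a\}$. This agent has two extended answer sets $\{\neg a\}$ and $\{a, b\}$,
of which the first one is preferred. The corresponding RA $A^r$ is defined by the following
program$^{3}$

$$
\begin{array}{llll}
b \leftarrow a & a \leftarrow & \neg a \leftarrow & \\ \hline
a' \leftarrow & b' \leftarrow & (\neg a)' \leftarrow & (\neg b)' \leftarrow \\
\neg a' \leftarrow & \neg b' \leftarrow & \neg(\neg a)' \leftarrow & \neg(\neg b)' \leftarrow \\
\leftarrow a', not\ a & \leftarrow b', not\ b & \leftarrow (\neg a)', not\ \neg a & \leftarrow (\neg b)', not\ \neg b \\
\leftarrow \neg a', a & \leftarrow \neg b', b & \leftarrow \neg(\neg a)', \neg a & \leftarrow \neg(\neg b)', \neg b
\end{array}
$$

$^{3}$ Rules below the line are smaller than the ones above w.r.t. $<_{A^r}$.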







and $(\neg a)' \leftarrow\ <_{A^r}\ \{a' \leftarrow,\ b' \leftarrow,\ \neg(\neg a)' \leftarrow\}$. This RA has the preferred answer set
$\{\neg a, (\neg a)', \neg a', \neg b', \neg(\neg b)'\} = \{\neg a\} \cup \{(\neg)l' \mid \{\neg a\} \models (not)\, l\}$.

Similarly to Theorem 2, we have that RAs for $P$ can be simulated by RAs for $P \cup L$.

Theorem 4. Let $P$ be a program and $L$ as in Theorem 3. $M$ is a preferred answer set
of a RA $A$ for $P$ iff $M' = M \cup \{(\neg)l' \mid l \in \mathcal{L}_P, M \models (not)\, l\}$ is a preferred answer set
of a RA $A'$ for $P \cup L$ where $<_{A'}$ is equal to $<_A$ with additionally $r <_{A'} L$ for every $r$
appearing in $<_A$.

Theorems 1 and 2 allow the simulation of an arbitrary AH by a LAH. This is done by
extending the program $P$ with the set of rules $R$ as in Theorem 1, and by transforming the
rule agents to literal agents (Theorem 1), while the literal agents are adapted according
to Theorem 2.

Theorem 5. Let $(P, T)$ be an AH. $M$ is a preferred answer set of $(P, T)$ iff $M' =
M \cup \{r_i \mid r_i \in P_M\}$ is a preferred answer set of the LAH $(P \cup R, T')$, with $T'$ defined
as $T$ but with every rule or literal agent replaced by a literal agent as in Theorems 1 and 2.

Similarly, but now with Theorems 3 and 4, we can reduce arbitrary AHs to RAHs. 

Theorem 6. Let $(P, T)$ be an AH. $M$ is a preferred answer set of $(P, T)$ iff $M' =
M \cup \{(\neg)l' \mid l \in \mathcal{L}_P, M \models (not)\, l\}$ is a preferred answer set of the RAH $(P \cup L, T')$,
with $T'$ defined as $T$ but with every rule or literal agent replaced by a rule agent as in
Theorems 3 and 4.



3 Complexity 

We briefly recall some relevant notions of complexity theory (see e.g. [20,2] for a nice
introduction). The class P (NP) represents the problems that are deterministically
(nondeterministically) decidable in polynomial time, while coNP contains the problems
whose complements are in NP.

The polynomial hierarchy, denoted PH, is made up of three classes of problems,
i.e. $\Delta^P_k$, $\Sigma^P_k$ and $\Pi^P_k$, $k \ge 0$, which are defined as
$\Delta^P_0 = \Sigma^P_0 = \Pi^P_0 = P$, $\Delta^P_{k+1} = P^{\Sigma^P_k}$, $\Sigma^P_{k+1} = NP^{\Sigma^P_k}$,
and $\Pi^P_{k+1} = co\Sigma^P_{k+1}$. The class $P^{\Sigma^P_k}$ ($NP^{\Sigma^P_k}$) represents
the problems decidable in deterministic (nondeterministic) polynomial time using an
oracle for problems in $\Sigma^P_k$, where an oracle is a subroutine capable of solving $\Sigma^P_k$
problems in unit time. The class PH is defined by $PH = \bigcup_{k=0}^{\infty} \Sigma^P_k$. Note that
$\Sigma^P_k \subseteq \Sigma^P_k \cup \Pi^P_k \subseteq \Delta^P_{k+1} \subseteq \Sigma^P_{k+1}$.
In the following, we will usually omit the $P$-superscript
to avoid cluttered lines. A language $L$ is called complete for a complexity class $C$ if
both $L$ is in $C$ and $L$ is hard for $C$. Showing that $L$ is hard is normally done by reducing
a known complete decision problem to a decision problem in $L$.

First of all, checking whether an interpretation $I$ is an extended answer set of a
program $P$ is in P, because (a) checking if each rule in $P$ is either satisfied or defeated
w.r.t. $I$, (b) applying the GL-reduct on $P_I$ w.r.t. $I$, i.e. computing $(P_I)^I$, and (c) checking
whether the positive program $(P_I)^I$ has $I$ as its unique minimal model, can all be done
in polynomial time.
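
Steps (a) to (c) can be sketched directly in Python. The sketch is an illustration only and rests on assumptions that are not spelled out in this section: literals are strings with a leading `-` for classical negation, rules are triples `(head, positive_body, negative_body)`, and a rule with head `a` is taken to be defeated w.r.t. `I` when some rule with head `¬a` has a true body and its head in `I`. All function names are hypothetical.

```python
def neg(lit):
    """Classical negation on string literals: 'a' <-> '-a'."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def body_true(rule, I):
    _, pos, negb = rule
    return set(pos) <= I and not (set(negb) & I)

def satisfied(rule, I):
    head = rule[0]
    return (head is not None and head in I) or not body_true(rule, I)

def defeated(rule, P, I):
    """Assumed definition: a competing rule with the opposite head is applied w.r.t. I."""
    head = rule[0]
    if head is None:
        return False
    return any(r[0] == neg(head) and r[0] in I and body_true(r, I) for r in P)

def least_model(positive_rules):
    """Least model of a positive program given as (head, positive_body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if head is not None and head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def is_extended_answer_set(P, I):
    I = set(I)
    if any(neg(l) in I for l in I):                      # I has to be consistent
        return False
    if not all(satisfied(r, I) or defeated(r, P, I) for r in P):        # step (a)
        return False
    P_I = [r for r in P if satisfied(r, I)]
    reduct = [(h, pos) for h, pos, negb in P_I if not (set(negb) & I)]  # step (b)
    return least_model(reduct) == I                      # step (c)

# The program of Example 2: b <- a, a <-, -a <-
P = [("b", ("a",), ()), ("a", (), ()), ("-a", (), ())]
```

Each step is a single pass or a fixpoint computation over the rules, which is where the polynomial bound comes from. On the three-rule program of Example 2 the check accepts `{"-a"}` and `{"a", "b"}` and rejects, for instance, `{"a"}`.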







For an agent $A$ and a program $P$, checking whether $M$ is not a preferred answer set
is in NP, because one can guess a set $N \sqsubset_A M$ in polynomial time, and subsequently
verify that $N$ is an extended answer set of $P$, which can also be done in P.

On the other hand, the complexity of checking whether an extended answer set $M$
is not preferable by a certain agent $A_x$ in a hierarchy $(P, T)$ depends on the location of
the agent in the tree $T$. For an agent $A_x$ in $T$, we denote with $d(A_x)$ the length of the
longest path from $A_x$ to an independent agent $A_{x \cdot y}$, $y \in \mathbb{N}^*$, i.e. $d(A_x) = \max_y |y|$
over independent agents $A_{x \cdot y}$, where $|y|$ is the length of the string $y$. We define the depth
of $T$ as the longest path from the root, i.e. $d(T) = d(A_\varepsilon)$.

Lemma 1. Let $(P, T)$ be an AH$^{4}$, and let $M$ be an extended answer set of $P$. Checking
whether $M$ is not preferable by $A_x$ is in $\Sigma_{d(A_x)+1}$.

Proof. The proof is by induction. In the base case, i.e. $A_x$ is an independent agent, we
have that $d(A_x) = 0$. Checking whether $M$ is not preferable by $A_x$ means checking
whether $M$ is not a preferred answer set of the agent $A_x$ for $P$, which is in
$NP = \Sigma_1 = \Sigma_{d(A_x)+1}$.

For the induction step, checking that $M$ is not preferable by a dependent and-agent
(or-agent) $A_x$ with $m$ successors can be done by (a) checking that $M$ is (or is not)
preferable by every (some) $A_{x \cdot i}$, $1 \le i \le m$. Since checking whether $M$ is (or is not)
preferable by an $A_{x \cdot i}$ can be done, by the induction hypothesis, in $\Sigma_{d(A_{x \cdot i})+1}$, we have
that checking whether $M$ is preferable by an $A_{x \cdot i}$ is also in $C = \Sigma_{\max_{1 \le i \le m} d(A_{x \cdot i})+1}$,
and (b) guessing, if $M$ is preferable by every (some) $A_{x \cdot i}$, $1 \le i \le m$, an interpretation
$N \sqsubset_{A_x} M$ and checking that it is not the case that $N$ is not preferable by every (some)
$A_{x \cdot i}$, $1 \le i \le m$, which is again in $C$ due to the induction hypothesis.

As a result, at most $2m$ calls are made to a $C$-oracle and at most one guess is made,
yielding that the problem itself is in $NP^C = \Sigma_{\max_{1 \le i \le m} d(A_{x \cdot i})+1+1} = \Sigma_{d(A_x)+1}$. □
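
The final equality in the proof relies on two facts that are used implicitly: the longest path from a dependent agent runs through one of its successors, and an NP machine with a $\Sigma_k$ oracle decides exactly the problems in $\Sigma_{k+1}$. Spelled out as a worked step (a restatement of the argument, not an additional result):

```latex
\[
d(A_x) \;=\; 1 + \max_{1 \le i \le m} d(A_{x \cdot i})
\quad\Longrightarrow\quad
\mathrm{NP}^{\Sigma_{\max_{1 \le i \le m} d(A_{x \cdot i}) + 1}}
\;=\; \Sigma_{\max_{1 \le i \le m} d(A_{x \cdot i}) + 2}
\;=\; \Sigma_{d(A_x) + 1}.
\]
```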

Using the above yields the following theorem about the complexity of AHs. 

Theorem 7. Let $(P, T)$ be an AH and $l$ a literal. Deciding whether there is a preferred
answer set containing $l$ is in $\Sigma_{d(T)+2}$. Deciding whether every preferred answer set
contains $l$ is in $\Pi_{d(T)+2}$.

Proof. The first task can be performed by an NP-algorithm that guesses an interpretation
$M \ni l$ and checks that it is not the case that $M$ is not preferable up to the root agent $A_\varepsilon$.
Due to Lemma 1, the latter is in $\Sigma_{d(A_\varepsilon)+1} = \Sigma_{d(T)+1}$, so the former is in
$NP^{\Sigma_{d(T)+1}} = \Sigma_{d(T)+2}$.

By the previous, finding a preferred answer set $M$ not containing $l$, i.e. $l \notin M$, is in
$\Sigma_{d(T)+2}$. Hence, the complement of the problem is in $\Pi_{d(T)+2}$. □

Consider a LAH $(P, T)$ where the tree $T$ is a linear order containing $n$ literal agents
$\{A_\varepsilon, A_1, A_{11}, \ldots, A_{11 \ldots 1}\}$, i.e. a linear LAH. Deciding whether there is a preferred
answer set of a linear LAH, containing a literal, is $\Sigma_{n+1}$-complete, i.e. $\Sigma_{d(T)+2}$-complete,
as is shown in [19]. Furthermore, deciding whether every preferred answer set of a linear
LAH contains a literal, is $\Pi_{d(T)+2}$-complete [19]. Hardness for AHs follows then
immediately from the hardness of linear LAHs.

4 The depth of the tree is assumed to be bounded by a constant. 







Theorem 8. The problem of deciding, given an AH $(P, T)$ and a literal $l$, whether there
exists a preferred answer set containing $l$ is $\Sigma_{d(T)+2}$-hard. Deciding whether every
preferred answer set contains $l$ is $\Pi_{d(T)+2}$-hard.

Proof. Checking whether there is a preferred answer set containing $l$ for a linear LAH
$(P, T)$ is $\Sigma_{d(T)+2}$-complete, and since a linear LAH is an AH, the result follows.

The second problem can be similarly shown to be $\Pi_{d(T)+2}$-hard. □

The following is immediate from Theorems 7 and 8.

Corollary 1. The problem of deciding, given an arbitrary AH $(P, T)$ and a literal $l$,
whether there is a preferred answer set containing $l$ is $\Sigma^P_{d(T)+2}$-complete. On the other
hand, deciding whether every preferred answer set contains $l$ is $\Pi^P_{d(T)+2}$-complete.

4 Relationship with Weak Constraints 

Weak constraints were introduced in [9] as a relaxation of the concept of a constraint.
Intuitively, a weak constraint is allowed to be violated, but only as a last resort, meaning
that one tries to minimize the number of violated constraints. Additionally, weak
constraints may be hierarchically layered by means of a totally ordered set of sets of weak
constraints $W = \{W_1, W_2, \ldots, W_n\}$, where it is assumed that $W_i < W_{i+1}$, $1 \le i < n$,
if the weak constraints in $W_i$ are more important than the ones in $W_{i+1}$. Intuitively, one
first chooses the answer sets that minimize the number of violated constraints in the
most important $W_1$, and then, among those, one chooses the extended answer sets that
minimize the number of violated constraints in $W_2$, etc.

Formally, a weak logic program (WLP) is a pair $(P, W)$ where $P$ is a program,
and $W$ is a totally ordered set of sets of weak constraints, specified syntactically as
constraints $\leftarrow \beta$.

Definition 3. Let $(P, W = \{W_1, \ldots, W_n\})$ be a WLP. An extended answer set $M$ of $P$
is preferable up to $W_1$ if no extended answer set $N$ of $P$ exists such that $|V^N_{W_1}| < |V^M_{W_1}|$,
where $V^X_{W_i}$ are the weak constraints in $W_i$ that are violated by an interpretation $X$. An
extended answer set $M$ of $P$ is preferable up to $W_i$, $1 < i \le n$, if

- $M$ is preferable up to $W_{i-1}$, and

- there is no $N$, preferable up to $W_{i-1}$, such that $|V^N_{W_i}| < |V^M_{W_i}|$.

An extended answer set $M$ of $P$ is preferred if it is preferable up to $W_n$.
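
Definition 3 amounts to a lexicographic filter: at each layer only the candidates with the fewest violated constraints survive. A minimal Python sketch of this reading follows; the function names, the encoding of a constraint as a pair `(positive_body, negative_body)`, and the small instance (four extended answer sets with $W_1 = \{\leftarrow a\}$ and $W_2 = \{\leftarrow \neg a, \leftarrow \neg b\}$) are assumptions made for the illustration, and the extended answer sets are taken as given rather than computed.

```python
def violated(constraints, X):
    """Number of constraints (positive_body, negative_body) whose body is true in X."""
    return sum(1 for pos, negb in constraints
               if set(pos) <= X and not (set(negb) & X))

def preferred_wlp(answer_sets, layers):
    """Filter the extended answer sets layer by layer, most important layer first."""
    survivors = list(answer_sets)
    for layer in layers:
        if not survivors:
            break
        best = min(violated(layer, X) for X in survivors)
        survivors = [X for X in survivors if violated(layer, X) == best]
    return survivors

# Hypothetical instance; "-" encodes classical negation.
answer_sets = [{"a", "b"}, {"a", "-b"}, {"-a", "b"}, {"-a", "-b"}]
W1 = [(("a",), ())]                      # <- a
W2 = [(("-a",), ()), (("-b",), ())]      # <- -a  and  <- -b
result = preferred_wlp(answer_sets, [W1, W2])
```

On this instance the first layer keeps the two answer sets without `a`, and the second layer keeps only the one that violates a single constraint of `W2`.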

In [9] a Datalog$^{not}$ program $LP$ is used$^{5}$, disallowing empty heads and classical
negation, but allowing for a set of strong constraints $S$. Clearly, this is subsumed by
Definition 3, by taking $P = LP \cup S$, and noting that the extended answer set semantics
reduces to the answer set semantics, due to the absence of classical negation. Although
the preferred models are defined in [9] by means of an objective function that has to be
minimized, they are equivalent [12] to the ones resulting from Definition 3.

$^{5}$ The general mechanism is introduced with Datalog$^{\vee,not}$ programs, which allow for disjunction
in the head.







The semantics of weak constraints, with preferability up to certain levels, appears
very similar to our preferability notion in an agent hierarchy. However, due to the use
of cardinality, deciding whether a literal $l$ is contained in some preferred answer set of
a WLP is $\Delta^P_2$-complete. As $\Delta^P_2 \subseteq \Sigma^P_2$, agent hierarchies of depth 0 suffice to capture
WLPs. More specifically, we show that a single agent can solve the problem.

Example 3. Take the weak logic program $(P, \{W_1, W_2\})$, with the program $P$ consisting
of rules $a \leftarrow$, $\neg a \leftarrow$, $b \leftarrow$, and $\neg b \leftarrow$, and $W_1 = \{\leftarrow a\}$, $W_2 = \{\leftarrow \neg a, \leftarrow \neg b\}$.
We have 4 extended answer sets $M_1 = \{a, b\}$, $M_2 = \{a, \neg b\}$, $M_3 = \{\neg a, b\}$,
and $M_4 = \{\neg a, \neg b\}$ of which $M_3$ and $M_4$ are preferable up to $W_1$, and only $M_3$ is
preferable up to $W_2$. Indeed $|V^{M_1}_{W_1}| = |V^{M_2}_{W_1}| = 1$, $|V^{M_3}_{W_1}| = |V^{M_4}_{W_1}| = 0$, $|V^{M_1}_{W_2}| = 0$,
$|V^{M_2}_{W_2}| = |V^{M_3}_{W_2}| = 1$ and $|V^{M_4}_{W_2}| = 2$. We define the set $WC$ as the rules $c^1_1 \leftarrow a$,
$c^2_1 \leftarrow \neg a$, and $c^2_2 \leftarrow \neg b$, identifying the weak constraints and the level on which they
appear, and rules counting the number of violated constraints in a $W_i$, for $0 \le l \le k$,

$$
\begin{array}{ll}
co(1, 0, w_1) \leftarrow not\ c^1_1 & co(2, l, w_2) \leftarrow co(1, l, w_2), not\ c^2_2 \\
co(1, 1, w_1) \leftarrow c^1_1 & co(2, l+1, w_2) \leftarrow co(1, l, w_2), c^2_2
\end{array}
$$

Intuitively, the third argument in a $co/3$ literal identifies the particular $W_i$ we are looking
at, the first argument shows the number of constraints in $W_i$ that have already been
considered, and the second argument effectively counts the number of violated constraints
in $W_i$. Further, $WC$ also contains the rules defining the number of violated constraints
in each set of weak constraints, $i \in \{0, 1\}$, $j \in \{0, 1, 2\}$: $co(i, w_1) \leftarrow co(1, i, w_1)$ and
$co(j, w_2) \leftarrow co(2, j, w_2)$.

The order $<$ on literals is defined as follows: $co(0, w_1) < co(1, w_1) < co(0, w_2)
< co(1, w_2) < co(2, w_2) < R$, with $R$ the extended literals $\mathcal{L}^*_{P \cup WC}$ without the $co/2$
atoms. Intuitively, the $w_1$ constraints are more important than the $w_2$ constraints, and
hence appear below them, and, among each $w_i$, one rather has a low count than a high
count, since this implies fewer violated constraints. One can check that the preferred
answer set of the literal agent $A = <$ for $P \cup WC$ is
$M'_3 = M_3 \cup \{c^2_1, co(1, 0, w_1), co(1, 1, w_2), co(2, 1, w_2), co(0, w_1), co(1, w_2)\}$.

Formally, we have the following result, where the weak constraints in a $W_j$ are assumed
to be numbered and explicitly tagged with a superscript identifying $W_j$, i.e.
$W_j = \{\leftarrow \beta^j_1, \ldots, \leftarrow \beta^j_{|W_j|}\}$.

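Theorem 9. Let $(P, W = \{W_1, \ldots, W_n\})$ be a weak logic program. $M$ is a preferred
answer set of $(P, W)$ iff, for all $1 \le j \le n$,

$$
\begin{array}{rl}
M' = & M \cup \{c^j_i \mid M \models \beta^j_i\} \cup \{co(1, a, w_j) \mid c^j_1 \notin M' \Rightarrow a = 0,\ c^j_1 \in M' \Rightarrow a = 1\} \\
& \cup\ \{co(k+1, a, w_j) \mid co(k, l, w_j) \in M',\ 0 \le l \le k < |W_j| \wedge [c^j_{k+1} \notin M' \Rightarrow a = l,\ c^j_{k+1} \in M' \Rightarrow a = l+1]\} \\
& \cup\ \{co(m, w_j) \mid co(|W_j|, m, w_j) \in M'\}
\end{array}
$$

is a preferred answer set of the literal agent $A$ for $P \cup WC$ where $WC$, for all $1 \le j \le n$,
consists of the rules $c^j_i \leftarrow \beta^j_i$, $co(1, 0, w_j) \leftarrow not\ c^j_1$, $co(1, 1, w_j) \leftarrow c^j_1$ and

$$
\begin{array}{ll}
co(k+1, l, w_j) \leftarrow co(k, l, w_j), not\ c^j_{k+1} & \text{with } 0 \le l \le k < |W_j| \\
co(k+1, l+1, w_j) \leftarrow co(k, l, w_j), c^j_{k+1} &
\end{array}
$$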







together with the rules $co(m, w_j) \leftarrow co(|W_j|, m, w_j)$, $0 \le m \le |W_j|$, and the order
$<_A$ defined as $co(0, w_1) <_A co(1, w_1) <_A \cdots <_A co(|W_1|, w_1) <_A \cdots <_A
co(0, w_n) <_A co(1, w_n) <_A \cdots <_A co(|W_n|, w_n)$.
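
The counting rules of $WC$ follow a regular pattern and can be generated per layer. The sketch below prints them in a made-up ASCII syntax (`:-` for the arrow, `c2_1` for $c^2_1$, `w2` for $w_2$); that syntax, the function name `counting_rules`, and the assumption that every layer contains at least one weak constraint are choices made for this illustration and are not tied to any particular solver.

```python
def counting_rules(j, bodies):
    """Rules of WC for layer W_j, where bodies[i-1] is the body text of the
    i-th weak constraint of that layer (at least one constraint is assumed)."""
    n, w = len(bodies), f"w{j}"

    def c(i):
        return f"c{j}_{i}"                 # atom flagging that constraint i of W_j is violated

    rules = [f"{c(i)} :- {body}" for i, body in enumerate(bodies, start=1)]
    rules += [f"co(1,0,{w}) :- not {c(1)}", f"co(1,1,{w}) :- {c(1)}"]
    for k in range(1, n):                  # 1 <= k < |W_j|
        for l in range(0, k + 1):          # 0 <= l <= k
            rules.append(f"co({k+1},{l},{w}) :- co({k},{l},{w}), not {c(k+1)}")
            rules.append(f"co({k+1},{l+1},{w}) :- co({k},{l},{w}), {c(k+1)}")
    # Final count: co(m, w_j) holds when m constraints of W_j are violated.
    rules += [f"co({m},{w}) :- co({n},{m},{w})" for m in range(n + 1)]
    return rules

# A layer with the two constraint bodies "-a" and "-b" at level 2.
rules = counting_rules(2, ["-a", "-b"])
```

The number of generated rules grows quadratically in the size of a layer, since a counter rule is needed for every pair of "constraints considered so far" and "violations counted so far".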

A more general approach, in the spirit of rule and literal agents, is to allow agents
to prefer the satisfaction of certain weak constraints over the satisfaction of other ones.
A weak logic program then becomes a pair $(P, W)$, where $P$ is a program and $W$ is a
set of constraints. A weak agent for $(P, W)$ corresponds to a well-founded strict partial
order on $W$, which induces an order $\sqsubseteq$ among interpretations $M$ and $N$ of $P$ such that
$M \sqsubseteq N$ iff $\forall w_2 \in W_N \setminus W_M \cdot \exists w_1 \in W_M \setminus W_N \cdot w_1 < w_2$, where $W_X$ are the weak
constraints in $W$ that are satisfied by $X$, mirroring Definition 1 for rule agents (note
that the latter are different from weak agents since RAs require the satisfaction of all
constraints in all extended answer sets).

The extended answer sets of $P$ are, by definition, the extended answer sets of a weak
agent $A$ for $(P, W)$, and preferred answer sets are defined as the minimal extended
answer sets w.r.t. $\sqsubset$. Note that a preferred answer set $M$ of a weak agent $A$ for $(P, W)$
has a minimal set of violated constraints, i.e. there is no extended answer set $N$ of $A$
such that $W \setminus W_N \subset W \setminus W_M$.

For a program $P$, define the extended program $E(P)$ as $P$ with the rules $a \leftarrow \beta$
replaced by $a \leftarrow \beta, not\ \neg a$. From Theorem 4 in [22] we have that the extended answer
sets of $P$ are exactly the answer sets of $E(P)$. We can then rewrite a weak agent as a
rule agent by introducing for each weak constraint $w : \leftarrow \beta$ rules $w \leftarrow \beta$ and $\neg w \leftarrow \beta$
such that $w$ is in an answer set if the constraint is violated.

Theorem 10. Let $A^w$ be a weak agent for a WLP $(P, W)$. $M$ is a preferred answer set
of $A^w$ for $(P, W)$ iff $M' = M \cup \{w \mid w \in W, M \not\models w\}$ is a preferred answer set of the
RA $A^r$ for $E(P) \cup WC$ with $WC = \{w \leftarrow \beta;\ \neg w \leftarrow \beta;\ \leftarrow \beta, not\ w \mid w : \leftarrow \beta \in W\}$
and $\neg w_1 \leftarrow \beta_1 <_{A^r} \neg w_2 \leftarrow \beta_2$ iff $w_1 <_{A^w} w_2$.
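
Both program transformations used in Theorem 10 are simple rewritings, sketched below in Python. The encoding of rules as triples `(head, positive_body, negative_body)`, the use of a leading `-` for classical negation, the constraint name `w1`, and the choice to leave headless rules of $P$ untouched in $E(P)$ are assumptions made for the illustration.

```python
def neg(lit):
    """Classical negation on string literals: 'a' <-> '-a'."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def extended_program(P):
    """E(P): every rule a <- β becomes a <- β, not ¬a.
    Headless rules are kept unchanged (an assumption of this sketch)."""
    return [(h, pos, negb if h is None else tuple(negb) + (neg(h),))
            for h, pos, negb in P]

def weak_constraint_rules(W):
    """For each named weak constraint w : <- β, produce
    w <- β,  ¬w <- β  and the constraint  <- β, not w."""
    WC = []
    for w, (pos, negb) in W.items():
        WC += [(w, pos, negb),
               (neg(w), pos, negb),
               (None, pos, tuple(negb) + (w,))]
    return WC

# Hypothetical inputs: the rule a <- b and the weak constraint w1 : <- a
EP = extended_program([("a", ("b",), ())])
WC = weak_constraint_rules({"w1": (("a",), ())})
```

The pair `w1 <- a` and `-w1 <- a` lets an extended answer set choose which of the two rules to defeat, and the ordering on the `-w` rules then carries over the weak agent's preference.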

Moreover, weak agents are as expressive as rule agents. 

Theorem 11. Let $P$ be a program and $A^r$ a RA for $P$. $M$ is a preferred answer set of
$A^r$ for $P$ iff $M$ is a preferred answer set of the weak agent $A^w$ for the WLP $(P, W)$,
with $W = \{\leftarrow \beta, not\ a \mid a \leftarrow \beta \in P\}$ and $\leftarrow \beta_1, not\ a_1 <_{A^w} \leftarrow \beta_2, not\ a_2$ iff
$a_1 \leftarrow \beta_1 <_{A^r} a_2 \leftarrow \beta_2$.

Weak agents, placed in a hierarchy, then allow for an intuitive decision making process
based on satisfaction and violation of weak constraints. The complexity of weak agent
hierarchies can easily be deduced from the reductions from and to rule agent hierarchies,
with Theorems 10 and 11 and their extensions for hierarchies.

Theorem 12. The problem of deciding, given a weak agent hierarchy $((P, W), T)$ and
a literal $l$, whether there is a preferred answer set containing $l$ is $\Sigma_{d(T)+2}$-complete.
On the other hand, deciding whether every preferred answer set contains $l$ is
$\Pi_{d(T)+2}$-complete.

5 Conclusions and Directions for Further Research 

In this paper, we introduced a system suitable to model hierarchical decision making. 
We equip agents with a preference relation on the available knowledge and allow them to 







cooperate with each other in a hierarchical fashion. Preferred solutions of these systems 
naturally correspond to preferred decisions regarding the problem. 

Initially, we defined two types of preference agents: rule agents express a preference
over rules, while literal agents express a preference over the extended literals they would
rather see in a solution. We showed that mixed AHs, containing both types of agents,
can be reduced to hierarchies consisting only of rule or literal agents. It turns out that
these AHs cover the polynomial hierarchy.

Finally, we showed that layered weak constraints can be easily simulated by a single 
agent. Furthermore, we generalized the concept of layered weak constraints to weak 
agent hierarchies, which are equivalent to rule agent hierarchies. 

Future work comprises a dedicated implementation of the approach, using existing
answer set solvers. E.g., we could generate an extended answer set which is then
improved recursively by a set of augmented programs, corresponding to the agents in the
hierarchy, generating strictly better solutions. A fixpoint of this procedure then
corresponds to a preferred answer set of the system.



References 

1. Jose Julio Alferes and Luis Moniz Pereira. Updates plus preferences. In Manuel
Ojeda-Aciego, Inma P. de Guzman, Gerhard Brewka, and Luis Moniz Pereira, editors, European
Workshop, JELIA 2000, volume 1919 of Lecture Notes in Artificial Intelligence, pages
345-360. Malaga, Spain, September-October 2000. Springer Verlag.

2. Chitta Baral. Knowledge Representation, Reasoning and Declarative Problem Solving.
Cambridge University Press, 2003.

3. Chitta Baral and Michael Gelfond. Reasoning agents in dynamic domains. In Logic-based 
artificial intelligence, pages 257-279. Kluwer Academic Publishers, 2000. 

4. G. Brewka. Logic programming with ordered disjunction. In Proceedings of the 18th National
Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications
of Artificial Intelligence, pages 100-105, Edmonton, Canada, July 2002. AAAI Press.

5. Gerhard Brewka and Thomas Eiter. Preferred answer sets for extended logic programs.
Artificial Intelligence, 109(1-2):297-356, April 1999.

6. Gerhard Brewka, Ilkka Niemelä, and Tommi Syrjänen. Implementing ordered disjunction
using answer set solvers for normal programs. In Flesca et al. [14], pages 444-455.

7. Francesco Buccafurri, Wolfgang Faber, and Nicola Leone. Disjunctive logic programs with
inheritance. In Danny De Schreye, editor, Logic Programming: The 1999 International
Conference, pages 79-93, Las Cruces, New Mexico, December 1999. MIT Press.

8. Francesco Buccafurri and Georg Gottlob. Multiagent compromises, joint fixpoints, and stable 
models. In Antonis C. Kakas and Fariba Sadri, editors, Computational Logic: Logic Program- 
ming and Beyond, Essays in Honour of Robert A. Kowalski, Part I, volume 2407 of Lecture 
Notes in Computer Science, pages 561-585. Springer, 2002. 

9. Francesco Buccafurri, Nicola Leone, and Pasquale Rullo. Strong and weak constraints in dis- 
junctive datalog. In Proceedings of the 4th International Conference on Logic Programming 
(LPNMR '97), pages 2-17, 1997. 

10. Francesco Buccafurri, Nicola Leone, and Pasquale Rullo. Disjunctive ordered logic: Seman- 
tics and expressiveness. In Anthony G. Cohn, Lenhard K. Schubert, and Stuart C. Shapiro, 
editors, Proceedings of the 6th International Conference on Principles of Knowledge Repre- 
sentation and Reasoning, pages 418-431, Trento, June 1998. Morgan Kaufmann. 







11. Francesco Buccafurri, Nicola Leone, and Pasquale Rullo. Enhancing disjunctive datalog by
constraints. Knowledge and Data Engineering, 12(5):845-860, 2000.

12. Wolfgang Faber. Disjunctive datalog with strong and weak constraints: Representational and
computational issues. Master’s thesis, Institut für Informationssysteme, Technische
Universität Wien, 1998.

13. Wolfgang Faber, Nicola Leone, and Gerald Pfeifer. Representing school timetabling in a 
disjunctive logic programming language. In Proceedings of the 13th Workshop on Logic 
Programming (WLP ’98), 1998. 

14. Sergio Flesca, Sergio Greco, Nicola Leone, and Giovambattista Ianni, editors. European
Conference on Logics in Artificial Intelligence (JELIA '02), volume 2424 of Lecture Notes in
Artificial Intelligence, Cosenza, Italy, September 2002. Springer Verlag.

15. D. Gabbay, E. Laenens, and D. Vermeir. Credulous vs. Sceptical Semantics for Ordered
Logic Programs. In J. Allen, R. Fikes, and E. Sandewall, editors, Proceedings of the 2nd
International Conference on Principles of Knowledge Representation and Reasoning, pages
208-217, Cambridge, Mass, 1991. Morgan Kaufmann.

16. Michael Gelfond and Vladimir Lifschitz. The stable model semantics for logic programming.
In Robert A. Kowalski and Kenneth A. Bowen, editors, Logic Programming, Proceedings of
the Fifth International Conference and Symposium, pages 1070-1080, Seattle, Washington,
August 1988. The MIT Press.

17. Robert A. Kowalski and Fariba Sadri. Logic programs with exceptions. In David H. D. 
Warren and Peter Szeredi, editors, Proceedings of the 7th International Conference on Logic 
Programming, pages 598-613, Jerusalem, 1990. The MIT Press. 

18. Els Laenens and Dirk Vermeir. A logical basis for object oriented programming. In Jan 
van Eijck, editor, European Workshop, JELIA 90, volume 478 of Lecture Notes in Artificial 
Intelligence, pages 317-332, Amsterdam, The Netherlands, September 1990. Springer Verlag. 

19. Davy Van Nieuwenborgh, Stijn Heymans, and Dirk Vermeir. On programs with linearly 
ordered multiple preferences, 2004. Accepted at ICLP '04. 

20. Christos H. Papadimitriou. Computational Complexity. Addison Wesley, 1994. 

21. Chiaki Sakama and Katsumi Inoue. Representing priorities in logic programs. In Michael J. 
Maher, editor, Proceedings of the 1996 Joint International Conference and Symposium on 
Logic Programming, pages 82-96, Bonn, September 1996. MIT Press. 

22. Davy Van Nieuwenborgh and Dirk Vermeir. Preferred answer sets for ordered logic programs. 
In Flesca et al. [14], pages 432-443. 

23. Davy Van Nieuwenborgh and Dirk Vermeir. Order and negation as failure. In Catuscia 
Palamidessi, editor, ICLP, volume 2916 of Lecture Notes in Computer Science, pages 194- 
208. Springer, 2003. 

24. Davy Van Nieuwenborgh and Dirk Vermeir. Ordered diagnosis. In Proceedings of the 10th
International Conference on Logic for Programming, Artificial Intelligence, and Reasoning
(LPAR2003), volume 2850 of Lecture Notes in Artificial Intelligence, pages 244-258, Almaty,
Kazakhstan, 2003. Springer Verlag.

25. Davy Van Nieuwenborgh and Dirk Vermeir. Ordered programs as abductive systems. In
Proceedings of the APPIA-GULP-PRODE Conference on Declarative Programming (AGP2003),
pages 374-385, Regio di Calabria, Italy, 2003.

26. Marina De Vos and Dirk Vermeir. Logic programming agents playing games. In Research and
Development in Intelligent Systems XIX (ES2002), BCS Conference Series, pages 323-336.
Springer-Verlag, 2002.

27. Kewen Wang, Lizhu Zhou, and Fangzhen Lin. Alternating fixpoint theory for logic programs 
with priority. In Proceedings of the First International Conference on Computational Logic 
(CL2000), volume 1861 of Lecture Notes in Computer Science, pages 164-178, London, UK, 
July 2000. Springer. 




Verifying Communicating Agents 
by Model Checking in a Temporal Action Logic* 



Laura Giordano 1 , Alberto Martelli 2 , and Camilla Schwind 3 

1 Dipartimento di Informatica, Università del Piemonte Orientale, Alessandria
2 Dipartimento di Informatica, Università di Torino, Torino
3 MAP, CNRS, Marseille, France 



Abstract. In this paper we address the problem of specifying and verifying
systems of communicating agents in a Dynamic Linear Time Temporal
Logic (DLTL). This logic provides a simple formalization of the
communicative actions in terms of their effects and preconditions.
Furthermore, it allows interaction protocols to be specified by means of temporal
constraints representing permissions and commitments. Agent programs,
when known, can be formulated in DLTL as complex actions (regular
programs). The paper addresses several kinds of verification problems,
including the problem of compliance of agents to the protocol, and
describes how they can be solved by model checking in DLTL using
automata.



1 Introduction 

The specification and the verification of the behavior of interacting agents is one 
of the central issues in the area of multi-agent systems. In this paper we address 
the problem of specifying and verifying systems of communicating agents in a 
Dynamic Linear Time Temporal Logic (DLTL). 

The extensive use of temporal logics in the specification and verification of 
distributed systems has led to the development of many techniques and tools 
for automating the verification task. Recently, temporal logics have gained at- 
tention in the area of reasoning about actions and planning [2,10,12,17,5], and 
they have also been used in the specification and in the verification of systems 
of communicating agents. In particular, in [21] agents are written in MABLE, 
an imperative programming language, and the formal claims about the system 
are expressed using a quantified linear time temporal BDI logic and can be au- 
tomatically verified by making use of the SPIN model checker. Guerin in [13] 
defines an agent communication framework which gives agent communication a 
grounded declarative semantics. In such a framework, temporal logic is used for 
formalizing temporal properties of the system. 

In this paper we present a theory for reasoning about communicative actions 
in a multiagent system which is based on the Dynamic Linear Time Temporal 

* This research has been partially supported by the project PRIN 2003 “Logic-based 
development and verification of multi-agent systems”, and by the European Com- 
mission within the 6th Framework Programme project REWERSE number 506779 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 57-69, 2004.
© Springer-Verlag Berlin Heidelberg 2004







L. Giordano, A. Martelli, and C. Schwind 



Logic (DLTL) [15], which extends LTL by strengthening the until operator by
indexing it with the regular programs of dynamic logic. In contrast to
[21], we adopt a social approach to agent communication [1,7,19,13], in which
communicative actions affect the “social state” of the system, rather than the
internal (mental) states of the agents. The social state records social facts, like
the permissions and the commitments of the agents. The dynamics of the system
emerges from the interactions of the agents, which must respect these permissions
and commitments (if they are compliant with the protocol). The social approach
allows a high level specification of the protocol, and does not require the rigid
specification of the allowed action sequences. It is well suited for dealing with
“open” multiagent systems, where the history of communications is observable,
but the internal states of the single agents may not be observable.

Our proposal relies on the theory for reasoning about action developed in 
[10] which is based on DLTL and which allows reasoning with incomplete initial 
states and dealing with postdiction, ramifications as well as with nondetermin- 
istic actions. It allows a simple formalization of the communicative actions in 
terms of their effects and preconditions as well as the specification of an inter- 
action protocol to constrain the behaviors of autonomous agents. 

In [11] we presented a proposal for reasoning about communicating 
agents in the Product Version of DLTL, which allows one to describe the behavior 
of a network of sequential agents which coordinate their activities by performing 
common actions together. Here we focus on the non-product version of DLTL, 
which appears to be a simpler choice and also a more reasonable one when a 
social approach is adopted. In fact, the Product Version of DLTL does not allow 
one to describe global properties of a system of agents, as it keeps the local states of 
the agents separate. Instead, the “social state” of the system is inherently global 
and shared by all of the agents. Moreover, we will see that the verification tasks 
described in [11] can be conveniently represented in DLTL without requiring the 
product version. The verification of the compliance of an agent with the protocol, 
the verification of protocol properties, and the verification that an agent is (or is not) 
respecting its social facts (commitments and permissions) at runtime are all 
examples of tasks which can be formalized either as validity or as satisfiability 
problems in DLTL. Such verification tasks can be automated by making use 
of Büchi automata. In particular, we make use of the tableau-based algorithm 
presented in [9] for constructing a Büchi automaton from a DLTL formula. The 
construction of the automaton can be done on-the-fly, while checking for the 
emptiness of the language accepted by the automaton. As for LTL, the number 
of states of the automaton is, in the worst case, exponential in the size of the 
input formula. 



2 Dynamic Linear Time Temporal Logic 

In this section we briefly define the syntax and semantics of DLTL as introduced 
in [15]. In such a linear time temporal logic the next state modality is indexed 




Verifying Communicating Agents by Model Checking 






by actions. Moreover, (and this is the extension to LTL) the until operator is 
indexed by programs in Propositional Dynamic Logic (PDL). 

Let $\Sigma$ be a finite non-empty alphabet. The members of $\Sigma$ are actions. Let 
$\Sigma^*$ and $\Sigma^\omega$ be the sets of finite and infinite words over $\Sigma$, where $\omega = \{0, 1, 2, \ldots\}$. 
Let $\Sigma^\infty = \Sigma^* \cup \Sigma^\omega$. We denote by $\sigma, \sigma'$ the words over $\Sigma^\omega$ and by $\tau, \tau'$ the words 
over $\Sigma^*$. Moreover, we denote by $\leq$ the usual prefix ordering over $\Sigma^*$ and, for 
$u \in \Sigma^\infty$, we denote by $\mathrm{prf}(u)$ the set of finite prefixes of $u$. 

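We define the set of programs (regular expressions) $Prg(\Sigma)$ generated by $\Sigma$ 
as follows: 

$$Prg(\Sigma) ::= a \mid \pi_1 + \pi_2 \mid \pi_1; \pi_2 \mid \pi^*$$

where $a \in \Sigma$ and $\pi_1, \pi_2, \pi$ range over $Prg(\Sigma)$. A set of finite words is associated 
with each program by the mapping $[[\,]] : Prg(\Sigma) \to 2^{\Sigma^*}$, which is defined as 
follows: 

- $[[a]] = \{a\}$; 

- $[[\pi_1 + \pi_2]] = [[\pi_1]] \cup [[\pi_2]]$; 

- $[[\pi_1; \pi_2]] = \{\tau_1\tau_2 \mid \tau_1 \in [[\pi_1]] \text{ and } \tau_2 \in [[\pi_2]]\}$; 

- $[[\pi^*]] = \bigcup_{i \in \omega} [[\pi^i]]$, where 

  • $[[\pi^0]] = \{\varepsilon\}$ 

  • $[[\pi^{i+1}]] = \{\tau_1\tau_2 \mid \tau_1 \in [[\pi]] \text{ and } \tau_2 \in [[\pi^i]]\}$, for every $i \in \omega$. 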
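As an illustration of the mapping from programs to sets of finite words, the following minimal Python sketch enumerates the words of a program up to a length bound. The nested-tuple encoding of programs and the function name `words` are assumptions made for this example and do not come from [15]; the bound is needed only because the language of a starred program is in general infinite.

```python
# A program is encoded (hypothetically) as a nested tuple:
#   ("act", a) | ("choice", p1, p2) | ("seq", p1, p2) | ("star", p)

def words(prog, max_len):
    """Return the words (tuples of actions) of [[prog]] with length <= max_len."""
    kind = prog[0]
    if kind == "act":
        return {(prog[1],)} if max_len >= 1 else set()
    if kind == "choice":
        return words(prog[1], max_len) | words(prog[2], max_len)
    if kind == "seq":
        left = words(prog[1], max_len)
        right = words(prog[2], max_len)
        return {u + v for u in left for v in right if len(u) + len(v) <= max_len}
    if kind == "star":
        base = words(prog[1], max_len)
        result = {()}            # pi^0 denotes the empty word only
        frontier = {()}
        while frontier:          # add one more iteration of pi per round
            new = {u + v for u in base for v in frontier
                   if len(u) + len(v) <= max_len} - result
            result |= new
            frontier = new
        return result
    raise ValueError(f"unknown program constructor: {kind}")

# (a; (b + c))*
example = ("star", ("seq", ("act", "a"), ("choice", ("act", "b"), ("act", "c"))))
```

For instance, `words(example, 2)` contains the empty word together with the two words `a b` and `a c`; raising the bound to 4 adds the four words obtained by iterating the body twice.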

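Let $\mathcal{P} = \{p_1, p_2, \ldots\}$ be a countable set of atomic propositions. The set of 
formulas of DLTL($\Sigma$) is defined as follows: 

$$\mathrm{DLTL}(\Sigma) ::= p \mid \neg\alpha \mid \alpha \vee \beta \mid \alpha\, \mathcal{U}^{\pi} \beta$$

where $p \in \mathcal{P}$ and $\alpha, \beta$ range over DLTL($\Sigma$). 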

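A model of DLTL($\Sigma$) is a pair $M = (\sigma, V)$ where $\sigma \in \Sigma^\omega$ and $V : \mathrm{prf}(\sigma) \to 2^{\mathcal{P}}$ 
is a valuation function. Given a model $M = (\sigma, V)$, a finite word $\tau \in \mathrm{prf}(\sigma)$ 
and a formula $\alpha$, the satisfiability of $\alpha$ at $\tau$ in $M$, written $M, \tau \models \alpha$, 
is defined as follows: 

- $M, \tau \models p$ iff $p \in V(\tau)$; 

- $M, \tau \models \neg\alpha$ iff $M, \tau \not\models \alpha$; 

- $M, \tau \models \alpha \vee \beta$ iff $M, \tau \models \alpha$ or $M, \tau \models \beta$; 

- $M, \tau \models \alpha\, \mathcal{U}^{\pi} \beta$ iff there exists $\tau' \in [[\pi]]$ such that $\tau\tau' \in \mathrm{prf}(\sigma)$ and $M, \tau\tau' \models \beta$. 
  Moreover, for every $\tau''$ such that $\varepsilon \leq \tau'' < \tau'$ (see footnote 1), $M, \tau\tau'' \models \alpha$. 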

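A formula $\alpha$ is satisfiable iff there is a model $M = (\sigma, V)$ and a finite word 
$\tau \in \mathrm{prf}(\sigma)$ such that $M, \tau \models \alpha$. 

The formula $\alpha\, \mathcal{U}^{\pi} \beta$ is true at $\tau$ if “$\alpha$ until $\beta$” is true on a finite stretch of 
behavior which is in the linear time behavior of the program $\pi$. 

The derived modalities $\langle\pi\rangle$ and $[\pi]$ can be defined as follows: $\langle\pi\rangle\alpha \equiv \top\, \mathcal{U}^{\pi} \alpha$ 
and $[\pi]\alpha \equiv \neg\langle\pi\rangle\neg\alpha$. 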
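The semantics of the indexed until operator can be made concrete with a small bounded evaluator. The sketch below is only an approximation under stated assumptions: it knows a finite prefix `sigma` of the infinite run, so a witness lying beyond that prefix is not found (and a negated until may therefore be reported true too eagerly). The tuple encodings of programs and formulas and the names `matches`, `holds`, `diamond` and `seq` are hypothetical, chosen for this example.

```python
TRUE = ("true",)

def matches(prog, word):
    """Decide whether the finite word (a tuple of actions) belongs to [[prog]]."""
    kind = prog[0]
    if kind == "act":
        return word == (prog[1],)
    if kind == "choice":
        return matches(prog[1], word) or matches(prog[2], word)
    if kind == "seq":
        return any(matches(prog[1], word[:k]) and matches(prog[2], word[k:])
                   for k in range(len(word) + 1))
    if kind == "star":
        # Empty word, or a non-empty first block followed by further iterations.
        return not word or any(
            matches(prog[1], word[:k]) and matches(prog, word[k:])
            for k in range(1, len(word) + 1))
    raise ValueError(f"unknown program constructor: {kind}")

def holds(formula, sigma, V, t=0):
    """Bounded check that formula holds at the prefix sigma[:t] of the run."""
    kind = formula[0]
    if kind == "true":
        return True
    if kind == "prop":
        return formula[1] in V(tuple(sigma[:t]))
    if kind == "not":
        return not holds(formula[1], sigma, V, t)
    if kind == "or":
        return holds(formula[1], sigma, V, t) or holds(formula[2], sigma, V, t)
    if kind == "until":
        _, prog, alpha, beta = formula
        for end in range(t, len(sigma) + 1):
            segment = tuple(sigma[t:end])      # candidate word tau'
            if (matches(prog, segment)
                    and holds(beta, sigma, V, end)
                    and all(holds(alpha, sigma, V, mid) for mid in range(t, end))):
                return True
        return False
    raise ValueError(f"unknown formula constructor: {kind}")

def diamond(prog, alpha):
    """Derived modality <prog>alpha, defined as TRUE U^prog alpha."""
    return ("until", prog, TRUE, alpha)

def seq(*progs):
    """Right-nested sequential composition of one or more programs."""
    return progs[0] if len(progs) == 1 else ("seq", progs[0], seq(*progs[1:]))

# Hypothetical run prefix and valuation: acc_rej holds once accept has occurred.
sigma = ["cfp", "propose", "accept"]

def V(prefix):
    return {"acc_rej"} if "accept" in prefix else set()

full = seq(("act", "cfp"), ("act", "propose"), ("act", "accept"))
short = seq(("act", "cfp"), ("act", "accept"))
```

With these definitions, `holds(diamond(full, ("prop", "acc_rej")), sigma, V)` succeeds because the run starts with a word of the program and the proposition holds afterwards, while the same query with `short` fails because no prefix of the run belongs to that program.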

1 We define $\tau \leq \tau'$ iff $\exists \tau''$ such that $\tau\tau'' = \tau'$. Moreover, $\tau < \tau'$ iff $\tau \leq \tau'$ and $\tau \neq \tau'$. 







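Furthermore, if we let $\Sigma = \{a_1, \ldots, a_n\}$, the $\mathcal{U}$, $\bigcirc$ (next), $\Diamond$ and $\Box$ operators 
of LTL can be defined as follows: $\bigcirc\alpha \equiv \bigvee_{a \in \Sigma} \langle a\rangle\alpha$, $\alpha\, \mathcal{U} \beta \equiv \alpha\, \mathcal{U}^{\Sigma^*} \beta$, $\Diamond\alpha \equiv \top\, \mathcal{U} \alpha$, 
$\Box\alpha \equiv \neg\Diamond\neg\alpha$, where, in $\mathcal{U}^{\Sigma^*}$, $\Sigma$ is taken to be a shorthand for the program 
$a_1 + \ldots + a_n$. Hence both LTL($\Sigma$) and PDL are fragments of DLTL($\Sigma$). As 
shown in [15], DLTL($\Sigma$) is strictly more expressive than LTL($\Sigma$). In fact, DLTL 
has the full expressive power of the monadic second order theory of $\omega$-sequences. 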

3 Action Theories 

In this section we recall the action theory developed in [10] that we use for 
specifying the interaction between communicating agents. 

Let $\mathcal{P}$ be a set of atomic propositions, the fluent names. A fluent literal $l$ is 
a fluent name $f$ or its negation $\neg f$. Given a fluent literal $l$, such that $l = f$ or 
$l = \neg f$, we define $|l| = f$. We will denote by Lit the set of all fluent literals. 

A domain description $D$ is defined as a tuple $(\Pi, C)$, where $\Pi$ is a set of 
action laws and causal laws, and $C$ is a set of constraints. 

Action laws in $\Pi$ have the form: $\Box(\alpha \to [a]\beta)$, with $a \in \Sigma$ and $\alpha, \beta$ arbitrary 
formulas, meaning that executing action $a$ in a state where precondition $\alpha$ holds 
causes the effect $\beta$ to hold. 

Causal laws in $\Pi$ have the form: $\Box((\alpha \wedge \bigcirc\beta) \to \bigcirc\gamma)$, meaning that if $\alpha$ holds 
in a state and $\beta$ holds in the next state, then $\gamma$ also holds in the next state. Such 
laws are intended to express “causal” dependencies among fluents. 

Constraints in C are arbitrary temporal formulas of DLTL. In particular, the 
set of constraints C contains all the temporal formulas which might be needed 
to constrain the behaviour of a protocol, including the value of fluents in the 
initial state. The set of constraints C also includes the precondition laws. 

Precondition laws have the form: $\Box(\alpha \to [a]\bot)$, meaning that the execution 
of an action $a$ is not possible if $\alpha$ holds (i.e. there is no resulting state following 
the execution of $a$ if $\alpha$ holds). Observe that, when there is no precondition law 
for an action, the action is executable in all states. 

Action laws and causal laws describe the changes to the state. All other 
fluents which are not changed by the actions are assumed to persist unaltered 
to the next state. To cope with the frame problem, the laws in $\Pi$, describing 
the (immediate and ramification) effects of actions, have to be distinguished 
from the constraints in $C$ and given a special treatment. In [10], to deal with 
the frame problem, a completion construction is defined which, given a domain 
description, introduces frame axioms for all the frame fluents in the style of the 
successor state axioms introduced by Reiter [18] in the context of the situation 
calculus. The completion construction is applied only to the action laws and 
causal laws in $\Pi$ and not to the constraints. In the following we call $Comp(\Pi)$ 
the completion of a set of laws $\Pi$ and we refer to [10] for the details on the 
completion construction. 

Test actions allow the choice among different behaviours to be controlled. 
As DLTL does not include test actions, we introduce them in the language as 
atomic actions in the same way as done in [10]. More precisely, we introduce 







an atomic action $\phi?$ for each proposition $\phi$ we want to test. The test action $\phi?$ 
is executable in any state in which $\phi$ holds and it has no effect on the state. 
Therefore, we introduce the following laws which rule the modality $[\phi?]$: 

$$\Box(\neg\phi \to [\phi?]\bot)$$

$$\Box(\langle\phi?\rangle\top \to (L \leftrightarrow [\phi?]L)), \text{ for all fluent literals } L.$$

The first law is a precondition law, saying that action $\phi?$ is only executable in 
a state in which $\phi$ holds. The second law describes the effects of the action on 
the state: the execution of the action $\phi?$ leaves the state unchanged. We assume 
that, for all test actions occurring in a domain description, the corresponding 
action laws are implicitly added. 

Unlike [10], in this paper we will use, besides boolean fluents, 
functional fluents, i.e. fluents which take a value in a (finite) set. We use the 
notation $f = V$ to say that fluent $f$ has value $V$. It is clear, however, that 
functional fluents can easily be represented by making use of multiple (mutually 
exclusive) boolean fluents. 

4 Contract Net Protocol 

In the social approach [7,13,19,22] an interaction protocol is specified by describ- 
ing the effects of communicative actions on the social state, and by specifying 
the permissions and the commitments that arise as a result of the current con- 
versation state. In our action theory the effects of communicative actions will 
be modelled by action laws. Permissions, which determine when an action can 
be taken by each agent, can be modelled by precondition laws. Commitment 
policies, which rule the dynamics of commitments, can be described by causal 
laws which establish the causal dependencies among fluents. The specification 
of a protocol can be further constrained through the addition of suitable tempo- 
ral formulas, and also the agents’ programs can be modelled, by making use of 
complex actions (regular programs). 

As a running example we will use the Contract Net protocol [6]. 

Example 1. The Contract Net protocol begins with an agent (the manager) 
broadcasting a task announcement (call for proposals) to other agents viewed as 
potential contractors (the participants). Each participant can reply by sending 
either a proposal or a refusal. The manager must send an accept or reject mes- 
sage to all those who sent a proposal. When a contractor receives an acceptance 
it is committed to perform the task. For lack of space we will leave out the final 
step of the protocol. 

Let us consider first the simplest case where we have only two agents: the 
manager (M) and the participant (P). The two agents share all the communica- 
tive actions, which are: cfp(T) (the manager issues a call for proposals for task 
T), accept and reject whose sender is the manager, and refuse and propose whose 
sender is the participant. 







The social state will contain the following domain specific fluents: task (a 
functional fluent whose value is the task which has been announced, or nil if 
the task has not yet been announced), replied (the participant has replied), 
proposal (the participant has sent a proposal) and acc_rej (the manager has sent 
an accept or reject message). Such fluents describe observable facts concerning 
the execution of the protocol. 

We also introduce special fluents to represent base-level commitments of the 
form $C(i, j, \alpha)$, meaning that agent $i$ is committed to agent $j$ to bring about 
$\alpha$, where $\alpha$ is an arbitrary formula, or they can be conditional commitments of 
the form $CC(i, j, \beta, \alpha)$ (agent $i$ is committed to agent $j$ to bring about $\alpha$, if 
the condition $\beta$ is brought about) (see footnote 2). For modelling the Contract Net example we 
introduce the following commitments 

$C(P, M, replied)$ and $C(M, P, acc\_rej)$ 

and conditional commitments 

$CC(P, M, task \neq nil, replied)$ and $CC(M, P, proposal, acc\_rej)$. 

Some reasoning rules have to be defined for cancelling commitments when 
they have been fulfilled and for dealing with conditional commitments. We 
introduce the following causal laws: 

$$\Box(\bigcirc\alpha \to \bigcirc\neg C(i, j, \alpha))$$

$$\Box(\bigcirc\alpha \to \bigcirc\neg CC(i, j, \beta, \alpha))$$

$$\Box((CC(i, j, \beta, \alpha) \wedge \bigcirc\beta) \to \bigcirc(C(i, j, \alpha) \wedge \neg CC(i, j, \beta, \alpha)))$$

A commitment (or a conditional commitment) to bring about $\alpha$ is cancelled 
when $\alpha$ holds, and a conditional commitment $CC(i, j, \beta, \alpha)$ becomes a base-level 
commitment $C(i, j, \alpha)$ when $\beta$ has been brought about. 
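The commitment rules can be read operationally as a one-step update of the set of open commitments. The following Python sketch is an assumed encoding, not the completion construction of [10]: commitments are tuples, conditions and goals are plain fluent names, commitments not affected by a rule persist, and when both the condition and the goal of a conditional commitment become true in the same step the commitment is simply dropped.

```python
def update_commitments(current, next_facts):
    """Apply the three commitment rules for one step.

    current    : set of commitments holding in the current state, encoded as
                 ("C", i, j, goal) or ("CC", i, j, condition, goal)
    next_facts : set of fluent names that are true in the next state
    """
    result = set()
    for c in current:
        if c[0] == "CC":
            _, i, j, condition, goal = c
            if goal in next_facts:
                continue                        # goal reached: CC is cancelled
            if condition in next_facts:
                result.add(("C", i, j, goal))   # condition met: CC becomes C
            else:
                result.add(c)                   # unaffected: persists
        else:
            _, i, j, goal = c
            if goal not in next_facts:          # goal reached: C is cancelled
                result.add(c)
    return result

# The manager's conditional commitment created by the call for proposals.
open_commitments = {("CC", "M", "P", "proposal", "acc_rej")}
after_propose = update_commitments(open_commitments, {"replied", "proposal"})
after_accept = update_commitments(after_propose, {"replied", "proposal", "acc_rej"})
```

Here `after_propose` holds the single base-level commitment of the manager to bring about `acc_rej`, and `after_accept` is empty.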

Let us now describe the effects of communicative actions by the following 
action laws: 

$$\Box[cfp(T)]\, task = T$$

$$\Box[cfp(T)]\, CC(M, P, proposal, acc\_rej)$$

$$\Box[accept]\, acc\_rej$$

$$\Box[reject]\, acc\_rej$$

$$\Box[refuse]\, replied$$

$$\Box[propose]\, (replied \wedge proposal)$$

The laws for action $cfp(T)$ add to the social state the information that a call 
for proposals has been issued for the task $T$, and that, if the manager receives a 
proposal, it is committed to accept or reject it. 

The permissions to execute communicative actions in each state are deter- 
mined by social facts. We represent them by precondition laws. Preconditions 
on the execution of action accept can be expressed as: 

2 The two kinds of base-level and conditional commitments we allow are essentially 
those introduced in [22]. Such choice is different from the one in [13] and in [11], 
where agents are committed to execute an action rather than to achieve a condition. 







$$\Box(\neg proposal \vee acc\_rej \to [accept]\bot)$$

meaning that action accept cannot be executed if a proposal has not been 
made, or if the manager has already replied. Similarly we can give the precondition 
laws for the other actions: 

$$\Box(\neg proposal \vee acc\_rej \to [reject]\bot)$$

$$\Box(task = nil \vee replied \to [refuse]\bot)$$

$$\Box(task = nil \vee replied \to [propose]\bot)$$

$$\Box(task \neq nil \to [cfp(T)]\bot).$$

The precondition law for action propose (refuse) says that a proposal (refusal) can only 
be made if $task \neq nil$, that is, if a task has already been announced, and the 
participant has not already replied. The last law says that the manager cannot 
issue a new call for proposals if $task \neq nil$, that is, if a task has already been 
announced. 
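The five precondition laws translate directly into an executability test on the social state. This is a minimal sketch under an assumed encoding: the state is a Python dict, `None` stands for the value nil of the fluent task, and the names `INITIAL` and `executable` are illustrative only.

```python
# Assumed encoding of the social state of the two-agent Contract Net.
INITIAL = {"task": None, "replied": False, "proposal": False, "acc_rej": False}

def executable(action, state):
    """Return True unless a precondition law blocks `action` in `state`."""
    if action in ("accept", "reject"):
        # blocked when no proposal was made or the manager already answered
        return state["proposal"] and not state["acc_rej"]
    if action in ("refuse", "propose"):
        # blocked when no task was announced or the participant already replied
        return state["task"] is not None and not state["replied"]
    if action == "cfp":
        # blocked when a task has already been announced
        return state["task"] is None
    raise ValueError(f"unknown action: {action}")

announced = dict(INITIAL, task="t1")
proposed = dict(announced, replied=True, proposal=True)
```

In `INITIAL` only `cfp` is executable; in `announced` the participant may `propose` or `refuse`; in `proposed` the manager may `accept` or `reject`.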

In the following we will denote by $Perm_i$ (permissions of agent $i$) the set of all 
the precondition laws of the protocol pertaining to the actions of which agent $i$ 
is the sender. 

Assume now that we want the participant to be committed to reply to the 
task announcement. We can express this by adding the following conditional 
commitment to the initial state of the protocol: $CC(P, M, task \neq nil, replied)$. 
Furthermore, the manager is committed initially to issue a call for proposals for a 
task. We can define the initial state Init of the protocol as follows: 

$$\{task = nil,\ \neg replied,\ \neg proposal,\ CC(P, M, task \neq nil, replied),\ C(M, P, task \neq nil)\}$$

In the following we will be interested in those executions of the protocol in 
which all commitments have been fulfilled. We can express the condition that 
the commitment $C(i, j, \alpha)$ will be fulfilled by the following constraint: 

$$\Box(C(i, j, \alpha) \to \Diamond\alpha)$$

We will call $Com_i$ the set of constraints of this kind for all commitments of agent 
$i$. $Com_i$ states that agent $i$ will fulfill all the commitments of which it is the 
debtor. 

Given the above rules, the domain description $D = (\Pi, C)$ of a protocol is 
defined as follows: $\Pi$ is the set of the action and causal laws given above, and 
$C = Init \wedge \bigwedge_i (Perm_i \wedge Com_i)$ is the set containing the constraints on the initial 
state, the permissions $Perm_i$ and the commitments $Com_i$ of all the agents (the 
agents P and M, in this example). 

Given a domain description $D$, let the completed domain description 
$Comp(D)$ be the set of formulas $(Comp(\Pi) \wedge Init \wedge \bigwedge_i (Perm_i \wedge Com_i))$. The 
runs of the system according to the protocol are the linear models of $Comp(D)$. 
Observe that in these protocol runs all permissions and commitments have been 
fulfilled. However, if $Com_j$ is not included for some agent $j$, the runs may contain 
commitments which have not been fulfilled by $j$. 







5 Verification 

Given the DLTL specification of a protocol by a domain description, we describe 
the different kinds of verification problems which can be addressed. 

First, given an execution history describing the interactions of the agents, we 
want to verify the compliance of that execution with the protocol. This verification 
is carried out at runtime. We are given a history $\tau = a_1, \ldots, a_n$ of the 
communicative actions executed by the agents, and we want to verify that the history 
$\tau$ is the prefix of a run of the protocol, that is, it respects the permissions and 
commitments of the protocol. This problem can be formalized by requiring that 
the formula 

$$(Comp(\Pi) \wedge Init \wedge \bigwedge_i (Perm_i \wedge Com_i)) \wedge \langle a_1; a_2; \ldots; a_n\rangle\top$$

(where $i$ ranges over all the agents involved in the protocol) is satisfiable. In fact, 
the above formula is satisfiable if it is possible to find a run of the protocol 
starting with the action sequence $a_1, \ldots, a_n$. 
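For the two-agent Contract Net, the runtime check can be approximated by replaying the history against the social state. The sketch below is not the satisfiability test itself: it only verifies that no action in the history violates a precondition law and reports which commitments are still open at the end, leaving aside the question of whether the open base-level commitments can still be fulfilled by some continuation. The state encoding, the fluent name `announced` (standing for the condition that the task is no longer nil) and the function names are assumptions made for this example.

```python
ACTIONS = {"cfp", "propose", "refuse", "accept", "reject"}

def facts(state):
    """Fluent names true in `state`; "announced" stands for task != nil."""
    true = {f for f in ("replied", "proposal", "acc_rej") if state[f]}
    if state["task"] is not None:
        true.add("announced")
    return true

def blocked(action, state):
    """Precondition laws: True if `action` is not executable in `state`."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if action in ("accept", "reject"):
        return not state["proposal"] or state["acc_rej"]
    if action in ("refuse", "propose"):
        return state["task"] is None or state["replied"]
    return state["task"] is not None              # cfp

def check_history(history, task="t1"):
    """Replay `history`; return (ok, final_state, open_commitments)."""
    state = {"task": None, "replied": False, "proposal": False, "acc_rej": False}
    # Commitments of the initial state Init.
    pending = {("CC", "P", "M", "announced", "replied"), ("C", "M", "P", "announced")}
    for action in history:
        if blocked(action, state):
            return False, state, pending          # a permission is violated
        created = set()
        if action == "cfp":                       # action laws
            state["task"] = task
            created.add(("CC", "M", "P", "proposal", "acc_rej"))
        elif action in ("accept", "reject"):
            state["acc_rej"] = True
        elif action == "refuse":
            state["replied"] = True
        else:                                     # propose
            state["replied"] = state["proposal"] = True
        now = facts(state)
        updated = set()
        for c in pending:                         # commitment rules
            goal = c[-1]
            if goal in now:
                continue                          # fulfilled, hence cancelled
            if c[0] == "CC" and c[3] in now:
                updated.add(("C", c[1], c[2], goal))
            else:
                updated.add(c)
        pending = updated | {c for c in created if c[-1] not in now}
    return True, state, pending
```

For example, `check_history(["cfp", "propose", "accept"])` reports a permitted history with no open commitments, whereas `check_history(["cfp", "accept"])` is rejected at the second action because no proposal has been made.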

A second problem is that of proving a property $\varphi$ of a protocol. This can be 
formulated as the validity of the formula 

$$(Comp(\Pi) \wedge Init \wedge \bigwedge_i (Perm_i \wedge Com_i)) \to \varphi. \qquad (1)$$

Observe that, to prove the property $\varphi$, all the agents are assumed to be compliant 
with the protocol. 

A further problem is to verify that an agent is compliant with the protocol, 
given the program executed by the agent itself. In our formalism we can specify 
the behavior of an agent by making use of complex actions (regular programs). 
Consider for instance the following program $\pi_P$ for the participant: 

$$[\neg done?;\ ((cfp(T);\ eval\_task;\ (\neg ok?;\ refuse;\ exit + ok?;\ propose)) + (reject;\ exit) + (accept;\ do\_task;\ exit))]^*;\ done?$$

The participant cycles and reacts to the messages received from the manager: 
for instance, if the manager has issued a call for proposal, the participant can 
either refuse or make a proposal according to his evaluation of the task; if the 
manager has accepted the proposal, the participant performs the task; and so 
on. 

The state of the agent is obtained by adding to the fluents of the protocol the 
following local fluents: done, which is initially false and is made true by action 
exit, and ok, which says whether the agent must make a bid or not. The local actions 
are eval_task, which evaluates the task and sets the fluent ok to true or false, 
do_task and exit. Furthermore, done? and ok? are test actions. 

The program of the contractor can be specified by a domain description 
$Prog_P = (\Pi_P, C_P)$, where $\Pi_P$ is a set of action laws describing the effects of the 
private actions of the contractor, for instance: 







$$\Box[exit]\, done$$

$$\Box(task = t_1 \to [eval\_task]\, ok)$$

$$\Box(task = t_2 \to [eval\_task]\, \neg ok)$$

and $C_P = \{\langle\pi_P\rangle\top, \neg done, \neg ok\}$ contains the constraints on the initial values of 
fluents ($\neg done$, $\neg ok$) as well as the formula $\langle\pi_P\rangle\top$ stating that the program of 
the participant is executable in the initial state. 

We now want to prove that the participant is compliant with the protocol, 
i.e. that all executions of program $\pi_P$ satisfy the specification of the protocol. 
This property cannot be proved by considering only the program $\pi_P$. In fact, 
it is easy to see that the correctness of the property depends on the behavior 
of the manager. For instance, if the manager begins with an accept action, the 
participant will execute the sequence of actions accept; do_task; exit and stop, 
which is not a correct execution of the protocol. Thus we also have to take into 
account the behavior of the manager. Since we do not know its internal 
behavior, we will assume that the manager respects its public behavior, i.e. that 
it respects its permissions and commitments in the protocol specification. 

The verification that the participant is compliant with the protocol can be 
formalized as a validity check. Let $D = (\Pi, C)$ be the domain description 
describing the protocol, as defined above. The formula 

$$(Comp(\Pi) \wedge Init \wedge Perm_M \wedge Com_M \wedge Comp(\Pi_P) \wedge C_P) \to (Perm_P \wedge Com_P)$$

is valid if in all the behaviors of the system, in which the participant executes 
its program $\pi_P$ and the manager (whose internal program is unknown) respects 
the protocol specification (in particular, its permissions and commitments), the 
permissions and commitments of the participant are also satisfied. 

6 Contract Net with N Participants 

Let us assume now that we have N potential contractors. The above formulation 
of the protocol can be extended by introducing fluents replied(i), proposal(i) 
and acc_rej(i) for each participant i, and similarly for the commitments. 
Furthermore, we introduce the communicative actions refuse(i) and propose(i), 
which are sent from participant i to the manager, and reject(i) and accept(i), 
which are sent from the manager to participant i. We assume action cfp(T) to 
be shared by all agents (broadcast by the manager). 

The theory describing the new version of the protocol can be easily obtained 
from the one given above. For instance the precondition laws for accept(i) and 
reject(i) must be modified so that these actions will be executed only after all 
participants have replied to the manager, i.e.: 

$$\Box((\neg proposal(i) \vee acc\_rej(i) \vee \bigvee_{j=1,\ldots,n} \neg replied(j)) \to [accept(i)]\bot)$$

and the same for reject(i). 
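The modified precondition law can be checked mechanically once the indexed fluents are stored per participant. This is a small sketch under an assumed encoding in which each of proposal(i), acc_rej(i) and replied(i) is a dict from participant index to a boolean; the function name `accept_blocked` is illustrative.

```python
def accept_blocked(i, proposal, acc_rej, replied):
    """True if the precondition law forbids accept(i) in the given state.

    proposal, acc_rej, replied: dicts mapping each participant index to a bool
    (an assumed encoding of the fluents proposal(i), acc_rej(i), replied(i)).
    """
    someone_silent = any(not replied[j] for j in replied)
    return (not proposal[i]) or acc_rej[i] or someone_silent

# Two participants: 1 has proposed, 2 has not replied yet.
proposal = {1: True, 2: False}
acc_rej = {1: False, 2: False}
waiting = {1: True, 2: False}
all_replied = {1: True, 2: True}
```

With `waiting`, the manager may not yet accept participant 1; once participant 2 has replied (here by refusing, so that `proposal[2]` stays false), accepting participant 1 becomes possible while accepting participant 2 remains blocked.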

The verification problems mentioned before can be formulated using the same 
approach. For instance, the verification that the protocol satisfies a given 
property $\varphi$ can be expressed as the validity of the formula (1) above, where $i$ ranges 







over the N participants and the manager. To prove compliance of a participant 
with the protocol, we need to restrict the protocol to the actions and the fluents 
shared between the manager and this participant (i.e. we need to take the pro- 
jection of the protocol on agent i and the agents with whom i interacts). Then 
the problem can be formulated as in the case of a single participant. 

We have not considered here the formulation of the problem in which 
proposals must be submitted within a given deadline. This would require adding to 
the system a further agent, clock. 

7 Model Checking 

The above verification and satisfiability problems can be solved by extending the 
standard approach for verification and model-checking of Linear Time Temporal 
Logic, based on the use of Büchi automata. As described in [15], the satisfiability 
problem for DLTL can be solved in deterministic exponential time, as for LTL, 
by constructing for each formula $\alpha \in$ DLTL($\Sigma$) a Büchi automaton $B_\alpha$ such 
that the language of $\omega$-words accepted by $B_\alpha$ is non-empty if and only if $\alpha$ 
is satisfiable. Actually a stronger property holds, since there is a one to one 
correspondence between models of the formula and infinite words accepted by 
$B_\alpha$. The size of the automaton can be exponential in the size of $\alpha$, while emptiness 
can be detected in time linear in the size of the automaton. 

The validity of a formula $\alpha$ can be verified by constructing the Büchi 
automaton $B_{\neg\alpha}$ for $\neg\alpha$: if the language accepted by $B_{\neg\alpha}$ is empty, then $\alpha$ is valid, 
whereas any infinite word accepted by $B_{\neg\alpha}$ provides a counterexample to the 
validity of $\alpha$. 
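The emptiness test behind both checks amounts to asking whether some accepting state is reachable from an initial state and lies on a cycle. The Python sketch below implements this condition in the simplest possible way, with one reachability search per accepting state; it is therefore quadratic in the worst case and is not the linear-time procedure alluded to in the text. The representation of the automaton (a successor map with action labels dropped) and the function names are assumptions made for this example.

```python
def reachable(starts, succ):
    """States reachable from `starts` (inclusive) via the successor map."""
    seen = set(starts)
    stack = list(seen)
    while stack:
        q = stack.pop()
        for r in succ.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def is_empty(initial, accepting, succ):
    """True iff no accepting state is both reachable and on a cycle."""
    for q in reachable(initial, succ) & set(accepting):
        # q lies on a cycle iff it can be reached from one of its successors
        if q in reachable(succ.get(q, ()), succ):
            return False
    return True

# Hypothetical automaton: 0 -> 1 -> 2 -> 1
succ = {0: [1], 1: [2], 2: [1]}
```

With this automaton, `is_empty([0], [2], succ)` is false because state 2 is visited infinitely often by the run 0 1 2 1 2 ..., whereas `is_empty([0], [0], succ)` is true because state 0 is never revisited.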

For instance, let CN be the completed domain description of the Contract 
Net protocol, that is $CN = (Comp(\Pi) \wedge Init \wedge \bigwedge_i (Perm_i \wedge Com_i))$. Then 
every infinite word accepted by $B_{CN}$ corresponds to a possible execution of the 
protocol. To prove a property $\varphi$ of the protocol, we can build the automaton 
$B_{\neg\varphi}$ and check that the language accepted by the product of $B_{CN}$ and $B_{\neg\varphi}$ is 
empty. 

The construction given in [15] is highly inefficient since it requires building 
an automaton with an exponential number of states, most of which will not be 
reachable from the initial state. A more efficient approach for constructing a 
Büchi automaton from a DLTL formula makes use of a tableau-based algorithm 
[9]. The construction of the states of the automaton is similar to the standard 
construction for LTL [8], but the possibility of indexing until formulas with 
regular programs puts stronger constraints on the fulfillment of until formulas 
than in LTL, requiring more complex acceptance conditions. The construction 
of the automaton can be done on-the-fly, while checking for the emptiness of the 
language accepted by the automaton. Given a formula $\varphi$, the algorithm builds a 
graph $G(\varphi)$ whose nodes are labelled by sets of formulas. States and transitions 
of the Büchi automaton correspond to nodes and arcs of the graph. The 
algorithm makes use of an auxiliary tableau-based function which expands the set 
of formulas at each node. As for LTL, the number of states of the automaton is, 







in the worst case, exponential in the size of the input formula, but in practice it 
is much smaller. For instance, the automaton obtained from the Contract Net 
protocol has about 20 states. 

LTL is widely used to prove properties of (possibly concurrent) programs 
by means of model checking techniques. The property is represented as an LTL 
formula $\varphi$, whereas the program generates a Kripke structure (the model), which 
directly corresponds to a Büchi automaton where all the states are accepting, 
and which describes all possible computations of the program. The property can 
be proved as before by taking the product of the model and of the automaton 
derived from $\neg\varphi$, and by checking for emptiness of the accepted language. 

In principle, with DLTL we do not need to use model checking, because pro- 
grams and domain descriptions can be represented in the logic itself, as we have 
shown in the previous section. However representing everything as a logical for- 
mula can be rather inefficient from a computational point of view. In particular 
all formulas of the domain description are universally quantified, and this means 
that our algorithm will have to propagate them from each state to the next one, 
and to expand them with the tableau procedure at each step. 

Therefore we have adapted model checking to the proof of the formulas given 
in the previous section, as follows. Let us assume that the negation of a formula 
to be proved can be represented as $F \wedge \varphi$, where $F = Comp(\Pi) \wedge Init$ contains 
the completion of the action and causal laws in the domain description and the 
initial state, and $\varphi$ the rest of the formula. For instance, in the verification of 
the compliance of the participant, the negation of the formula to be proved is 
$(Comp(\Pi) \wedge Init \wedge Perm_M \wedge Com_M \wedge Comp(\Pi_P) \wedge \neg(Perm_P \wedge Com_P))$ and 
thus $\varphi = (Perm_M \wedge Com_M \wedge Comp(\Pi_P) \wedge \neg(Perm_P \wedge Com_P))$. We can derive 
from $F$ an automaton describing all possible computations, whose states are sets 
of fluents, which we consider as the model. In particular, we can obtain from 
the domain description a function $trans_a(S)$, for each action $a$, for transforming 
a state into the next one, and then build this automaton by repeatedly applying 
these functions starting from the initial state. We can then proceed by taking 
the product of the model and of the automaton derived from $\varphi$, and by checking 
for emptiness of the accepted language. 

Note that, although this automaton has an exponential number of states, we 
can build it step by step by following the construction of the algorithm on-the-fly. 
The state of the product automaton will consist of two parts $\langle S_1, S_2\rangle$, 
where $S_1$ is a set of fluents representing a state of the model, and $S_2$ is a set of 
formulas. The initial state will be $\langle I, \varphi\rangle$, where $I$ is the initial set of fluents. 
A successor state through a transition $a$ will be obtained as $\langle trans_a(S_1), S_2'\rangle$ 
where $S_2'$ is derived from $S_2$ by the on-the-fly algorithm. If the two parts of a 
state are inconsistent, the state is discarded. 
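The step-by-step construction of the product can be sketched as a breadth-first exploration of pairs made of a model state and a formula-automaton state. The code below is a generic illustration under stated assumptions, not the tableau-based algorithm of [9]: the functions `trans`, `expand` and `consistent` are hypothetical parameters standing for the state transformer of the model, the on-the-fly expansion of the formula part, and the consistency test between the two parts; acceptance conditions are left out. The toy model and toy automaton at the end are invented for the example.

```python
from collections import deque

def explore_product(init_model, init_formula, actions, trans, expand, consistent):
    """Build the reachable product states and their labelled transitions.

    trans(a, s1)       -> successor model state, or None if a is not executable
    expand(a, s2)      -> iterable of successor states of the formula part
    consistent(s1, s2) -> False if the pair must be discarded
    """
    start = (init_model, init_formula)
    edges = {}
    seen = {start}
    queue = deque([start])
    while queue:
        s1, s2 = queue.popleft()
        edges[(s1, s2)] = []
        for a in actions:
            n1 = trans(a, s1)
            if n1 is None:
                continue
            for n2 in expand(a, s2):
                if not consistent(n1, n2):
                    continue                  # inconsistent pair: discarded
                nxt = (n1, n2)
                edges[(s1, s2)].append((a, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return edges

# Toy model: fluent p can be set once; noop is always executable.
def trans(a, s1):
    if a == "set_p":
        return None if "p" in s1 else s1 | {"p"}
    return s1

# Toy formula part: guess nondeterministically when p has been seen.
def expand(a, s2):
    return ["wait", "seen_p"] if s2 == "wait" else ["seen_p"]

def consistent(s1, s2):
    return s2 != "seen_p" or "p" in s1

edges = explore_product(frozenset(), "wait", ["set_p", "noop"],
                        trans, expand, consistent)
```

Only three product states are generated here; the pair combining the empty model state with `seen_p` is never created because it fails the consistency test.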

8 Conclusions 

We have shown that DLTL is a suitable formalism for specifying and verifying a 
system of communicating agents. Our approach provides a unified framework for 







describing different aspects of multi-agent systems. Programs are expressed as 
regular expressions, (communicative) actions can be specified by means of action 
and precondition laws, properties of social facts can be specified by means of 
causal laws and constraints, and temporal properties can be expressed by means 
of the until operator. We have addressed several kinds of verification problems, 
including the problem of compliance of agents to the protocol, and described how 
they can be solved by developing automata-based model checking techniques for 
DLTL. A preliminary implementation of a model checker based on the algorithm 
in [9] is being tested in the verification of the properties of various protocols. 

The issue of developing semantics for agent communication languages has 
been examined in [20], by considering in particular the problem of giving a 
verifiable semantics, i.e. a semantics grounded on the computational models. Guerin 
and Pitt [13,14] define an agent communication framework which gives agent 
communication a grounded declarative semantics. The framework introduces dif- 
ferent languages: a language for agent programming, a language for specifying 
agent communication and social facts, and a language for expressing temporal 
properties. Our approach instead provides a unified framework for describing 
multiagent systems using DLTL. 

While in this paper we follow a social approach to the specification and verifi- 
cation of systems of communicating agents, [4,3,16,21] have adopted a mentalistic 
approach. The goal of [3] is to extend model checking to make it applicable to 
multi-agent systems, where agents have BDI attitudes. This is achieved by using 
a new logic which is the composition of two logics, one formalizing temporal 
evolution and the other formalizing BDI attitudes. In [16,21] agents are writ- 
ten in MABLE, an imperative programming language, and have a mental state. 
MABLE systems may be augmented by the addition of formal claims about the 
system, expressed using a quantified, linear time temporal BDI logic. Instead 
[4] deals with programs written in AgentSpeak(F), a variation of the BDI logic 
programming language AgentSpeak(L). Properties of MABLE or AgentSpeak 
programs can be verified by means of the SPIN model checker. These papers do 
not deal with the problem of proving properties of protocols. 

Yolum and Singh [22] developed a social approach to protocol specification 
and execution. In this approach, commitments are formalized in a variant of 
event calculus. By using an event calculus planner it is possible to determine 
execution paths that respect the protocol specification. Alberti et al. address 
a similar problem, by expressing protocols in a logic-based formalism based on 
Social Integrity Constraints. In [1] they present a system that, during the evo- 
lution of a society of agents, verifies the compliance of the agents’ behavior to 
the protocol. 

References 

1. M. Alberti, D. Daolio and P. Torroni. Specification and Verification of Agent 
Interaction Protocols in a Logic-based System. SAC’04, March 2004. 

2. F. Bacchus and F. Kabanza. Planning for temporally extended goals, in Annals 
of Mathematics and AI, 22:5-27, 1998. 







3. M. Benerecetti, F. Giunchiglia and L. Serafini. Model Checking Multiagent Sys- 
tems. Journal of Logic and Computation. Special Issue on Computational Aspects 
of Multi-Agent Systems, 8(3):401-423. 1998. 

4. R. Bordini, M. Fisher, C. Pardavila and M. Wooldridge. Model Checking
AgentSpeak. AAMAS 2003, pp. 409-416, 2003.

5. D. Calvanese, G. De Giacomo and M.Y.Vardi. Reasoning about Actions and Plan- 
ning in LTL Action Theories. In Proc. KR’02, 2002. 

6. FIPA Contract Net Interaction Protocol Specification, 2002. Available at
http://www.fipa.org.

7. N. Fornara and M. Colombetti. Defining Interaction Protocols using a 
Commitment-based Agent Communication Language. Proc. AAMAS’03, Mel- 
bourne, pp. 520-527, 2003. 

8. R. Gerth, D. Peled, M.Y.Vardi and P. Wolper. Simple On-the-fly Automatic ver- 
ification of Linear Temporal Logic. In Proc. 15th Work. Protocol Specification, 
Testing and Verification, Warsaw, June 1995, North Holland. 

9. L. Giordano and A. Martelli. On-the-fly Automata Construction for Dynamic
Linear Time Temporal Logic. TIME’04, June 2004.

10. L. Giordano, A. Martelli, and C. Schwind. Reasoning About Actions in Dynamic 
Linear Time Temporal Logic. In FAPR’00 - Int. Conf. on Pure and Applied Prac- 
tical Reasoning, London, September 2000. Also in The Logic Journal of the IGPL, 
Vol. 9, No. 2, pp. 289-303, March 2001. 

11. L. Giordano, A. Martelli, and C. Schwind. Specifying and Verifying Systems of 
Communicating Agents in a Temporal Action Logic. In Proc. AI*IA ’03, Pisa, pp. 
262-274, Springer LNAI 2829, September 2003. 

12. F. Giunchiglia and P. Traverso. Planning as Model Checking. In Proc. The 5th 
European Conf. on Planning (ECP’99), pp.1-20, Durham (UK), 1999. 

13. F. Guerin. Specifying Agent Communication Languages. PhD Thesis, Imperial 
College, London, April 2002. 

14. F. Guerin and J. Pitt. Verification and Compliance Testing. Communications in 
Multiagent Systems, Springer LNAI 2650, pp. 98-112, 2003. 

15. J.G. Henriksen and P.S. Thiagarajan. Dynamic Linear Time Temporal Logic, in 
Annals of Pure and Applied logic, vol. 96, n.1-3, pp. 187-207, 1999 

16. M.P. Huget and M. Wooldridge. Model Checking for ACL Compliance Verification. 
ACL 2003, Springer LNCS 2922, pp. 75-90, 2003. 

17. M.Pistore and P.Traverso. Planning as Model Checking for Extended Goals in 
Non-deterministic Domains. Proc. IJCAI’01, Seattle, pp. 479-484, 2001. 

18. R. Reiter. The frame problem in the situation calculus: a simple solution (some- 
times) and a completeness result for goal regression. In Artificial Intelligence and 
Mathematical Theory of Computation: Papers in Honor of John McCarthy, V. 
Lifschitz, ed., pages 359-380, Academic Press, 1991. 

19. M. P. Singh. A social semantics for Agent Communication Languages. In IJCAI-98 
Workshop on Agent Communication Languages, Springer, Berlin, 2000. 

20. M. Wooldridge. Semantic Issues in the Verification of Agent Communication Lan- 
guages. Autonomous Agents and Multi-Agent Systems, vol. 3, pp. 9-31, 2000. 

21. M. Wooldridge, M. Fisher, M.P. Huget and S. Parsons. Model Checking Multi- 
Agent Systems with MABLE. In AAMAS’02, pp. 952-959, Bologna, Italy, 2002. 

22. P. Yolum and M.P. Singh. Flexible Protocol Specification and Execution: Apply- 
ing Event Calculus Planning using Commitments. In AAMAS’02, pp. 527-534, 
Bologna, Italy, 2002. 




Qualitative Action Theory 

A Comparison of the Semantics of 
Alternating-Time Temporal Logic and the 
Kutschera-Belnap Approach to Agency 



Stefan Wölfl

Institut für Informatik, Albert-Ludwigs-Universität Freiburg
Georges-Köhler-Allee, 79110 Freiburg, Germany

woelfl@informatik.uni-freiburg.de



Abstract. Qualitative action theory deals with purely qualitative de- 
scriptions and formal representations of agency, i.e., agents and their pos- 
sibilities for intervening in the causal flow of events. This means that, 
contrary to game theory, qualitative action theory abstains from any 
metric evaluation of the outcomes of actions. 

In this paper we present and compare two qualitative approaches to ac- 
tion theory that have been discussed in the literature. The first one com- 
ing from philosophical action theory is the Kutschera-Belnap approach, 
which is the semantic basis of so-called Stit-logics. The second approach 
is the semantics of Alur, Henzinger, and Kupferman’s Alternating-time 
Temporal Logic (ATL). In computer science, ATL has been introduced
as an extension of Computation Tree Logic (CTL) to allow for modeling
systems that interact with their environment. Surprisingly, although
both approaches are very close in spirit, a systematic analysis of the 
mutual dependencies between these approaches does not exist. 

The paper aims at bringing together these two research streams, which 
seem to have been developed independently in philosophy and computer 
science. In particular, we will investigate the assumptions under which
both approaches may be considered equivalent. Finally, further research
on this topic promises interesting results that translate between the ap- 
proaches presented here. 



1 Introduction 

Qualitative action theory deals with purely qualitative descriptions and formal 
representations of agency, i.e., agents and their possibilities for intervening in the 
causal flow of events. Qualitative theories of agency are typically situated in a 
setting that is well known in game theory: there are agents (or players) and each
agent has choices concerning how to act (possible moves in the play), where the 
set of choices an agent has may depend on the current state. Contrary to game 
theory, however, qualitative action theory abstains from any metric evaluation 
of the outcomes of the actions. This means, in particular, that qualitative action 
theory does not aim at a theory of how to act rationally in a specific situation, 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 70-81, 2004. 
© Springer-Verlag Berlin Heidelberg 2004







but rather restricts consideration to purely descriptive questions such as: what
possibilities does an agent have, and what could the world look like when all the
agents behave in a certain manner?

Qualitative action theory (in the narrow sense) restricts consideration to
concepts that are definable by causal notions and by terms describing the possible
choices of agents. Examples of such core concepts include the concept of action,
the concept of bringing-it-about-that, and the concept of strategy. In the wider
sense qualitative action theory also takes into account concepts referring to
doxastic and/or voluntative aspects of agency, such as beliefs, intentions, reasons,
goals, and aims.

In this paper we focus on two qualitative (core) approaches to action theory
discussed in the literature. The first one, coming from philosophical action
theory, is the Kutschera-Belnap approach to agency. Historically, the first approach
using game-theoretical notions for analyses in philosophical action theory was
presented by Lennart Åqvist [3]. Based on Åqvist’s and Georg H. von Wright’s
work on agency and Roman Ingarden’s work on causality, Franz von Kutschera
developed in the 1980s an approach to agency that took seriously the idea that
agency is only explainable in the context of an indeterministic theory (cf. [14]).
Kutschera presented formal models for representing agency, and he also
developed a semantics for action logics. Further developments were contributed by,
among others, Nuel D. Belnap, Brian F. Chellas, John F. Horty, and Michael
Perloff. In particular, Belnap contributed many philosophical investigations
regarding an indeterminist view of agency. He developed a formal semantics that
allows for modeling assertions (as speech acts), promisings, and moral
obligations.¹ Based on Kutschera and Belnap’s semantics, Ming Xu discussed
axiomatizations of action-theoretical concepts. In the literature these logical systems
are usually referred to as stit-logics.²

The second approach discussed in this paper is the semantics of Alur,
Henzinger, and Kupferman’s Alternating-time Temporal Logic (ATL) [1]. In
computer science ATL has been introduced as an extension of Computation Tree
Logic (CTL) to allow for modeling systems that interact with their
environment. In this context it is worth mentioning that there is a close connection
between ATL and M. Pauly’s Coalition Game Logic (CGL) [15,16], as has been
pointed out by Valentin Goranko [11]. In particular, ATL, CGL, and game
theory share the assumption of a discrete flow of time, while the Kutschera-Belnap
approach (in the sequel abbreviated as KB-approach) also allows for dense or
continuous flows.

Surprisingly, although the Kutschera-Belnap approach and the ATL approach
are very close in spirit, a systematic analysis of the mutual dependencies
between these approaches does not exist. This paper aims at bringing together
these two research streams, which seem to have developed independently in
philosophy and computer science. In particular, we will investigate the assumptions
under which both approaches may be considered equivalent theories. Finally,
further research on this topic promises interesting results that translate between
the semantic approaches presented here and the logics defined in terms of these
semantics, respectively.

¹ For discussions of the Kutschera-Belnap approach and for explications of
action-theoretical notions in this approach see [4,5], [10], [6,7,8], [12], and [13,14]. A
comprehensive presentation of the current state of discussion may be found in [9].

² Cf. [19,20,21,22,23,24] as well as [17].



2 The Kutschera-Belnap Approach 

The basic idea of the Kutschera-Belnap approach can be briefly sketched in the 
slogan: ‘No agency without real choices’. This means that in order to ascribe 
agency to agents, we must ascribe to them genuine choices for how to act, i.e., 
choices by which agents can influence the causal flow of events. These choices 
are genuine in that each agent must be able to refrain from what s/he is actually 
doing. In particular, each agent can realize one of her/his choices independently 
of what the other agents do at the same moment. Thus, the Kutschera-Belnap 
approach implicitly assumes that the causal flow of events is not causally deter- 
mined: If any event were causally determined (by previous events and/or previous 
circumstances), we would never be able to ascribe genuine choices to agents. 

This indeterministic point of view, then, is modeled by tree-like formal 
structures, i.e., by structures consisting of a set of nodes (called moments ) 
and a binary relation defined on this set, which represents the relation of 
being-causally-earlier-than. This relation allows for branching with respect to 
the future, but not with respect to the past. A (full) branch of such a ‘tree’ 
is called possible history and represents one of the many possible courses the 
world might take. The idea is that the future is causally open (in the sense 
that it is not causally determined by the present and the past), while the past 
is causally closed (events that occurred in the past are settled; they cannot
be undone). By acting, persons can influence the future, but not the
present or the past. But whether an agent can do something or not depends on
current circumstances, and these are subject to change over time. Thus, it may
occur that an agent can do something now, but that s/he cannot at some later
moment. To represent these intuitions within the basic tree-like models, one
assigns to each agent at each moment a set of (possible) choices such that each 
choice is consistent with the choices of all the other agents. 

These ideas are captured by the following formal definitions. 

Definition 2.1 (Tree). A tree is an ordered pair $B = \langle \mathit{Mom}, \prec \rangle$ consisting
of a non-void set $\mathit{Mom}$ (the set of moments) and an irreflexive, transitive, and
linear-to-the-left relation $\prec$ on $\mathit{Mom}$ (the relation of earlier-than). A maximal
$\prec$-chain is said to be a history in $B$, and the set of all histories of $B$ is denoted
by $\mathit{His}$. For each moment $m \in \mathit{Mom}$, let

$$\mathit{His}(m) := \{\, h \in \mathit{His} : m \in h \,\}$$

denote the set of histories that pass through moment $m$. Histories $h$ and $h'$ are
said to be undivided at moment $m$, $h \equiv_m h'$, if there exists a moment $m' \in h \cap h'$
with $m \prec m'$. If $h$ and $h'$ pass through $m$ and if there does not exist any
$m' \in h \cap h'$ with $m \prec m'$, then $h$ and $h'$ are said to split at $m$.



Definition 2.2 (Agent Tree). An agent tree is a triple $C = \langle B, \mathit{Ag}, \mathit{Ch} \rangle$, where
$B$ is a tree, $\mathit{Ag}$ is a non-void set of agents, and $\mathit{Ch}$ is a map that assigns to each
agent $\alpha \in \mathit{Ag}$ and each moment $m$ of $B$ a partition $\mathit{Ch}_\alpha(m)$ of $\mathit{His}(m)$ such that
the following conditions are satisfied:

(a) If $h \in X \in \mathit{Ch}_\alpha(m)$ and if $h$ and $h'$ are undivided at $m$, then $h'$ too is in $X$.

(b) Let $m$ be a moment of $B$ and suppose that $\chi$ is a map that assigns to each
agent $\alpha$ an element $\chi(\alpha) \in \mathit{Ch}_\alpha(m)$. Then there exists a history $h$ that is
contained in each $\chi(\alpha)$, i.e.,

$$\bigcap_{\alpha \in \mathit{Ag}} \chi(\alpha) \neq \emptyset.$$



The elements of $\mathit{Ch}_\alpha(m)$ are said to be the (momentary) choices of agent $\alpha$
at moment $m$, and $\mathit{Ch}_\alpha(m)$ is said to be the choice set of $\alpha$ at $m$. An agent $\alpha$
has non-vacuous choice at moment $m$ if $\mathit{Ch}_\alpha(m) \neq \{\mathit{His}(m)\}$, i.e., if $\alpha$ has at
least two choices at $m$.

By saying that each agent’s choice set forms a partition, we postulate that
at each moment each agent chooses exactly one of her/his alternatives.
Condition (a) of definition 2.2 means that an agent cannot separate histories that
are undivided. Finally, by condition (b), each agent can choose an alternative in
her/his choice set independently of the alternatives chosen by all the other agents
(at the same moment). In particular, at a given moment $m$, no agent can prevent
another agent from choosing any of her/his alternatives (at that moment).

Given a moment $m$ and a history $h \in \mathit{His}(m)$, let $\mathit{ch}_\alpha(m, h)$ denote the unique
element of $\mathit{Ch}_\alpha(m)$ that contains $h$. This means that $\mathit{ch}_\alpha(m, h)$ is the choice agent
$\alpha$ takes in history $h$ at moment $m$. An agent tree is said to be agent-complete
if, for all moments $m \in \mathit{Mom}$ and each pair of histories $h, h' \in \mathit{His}(m)$, it holds:

$$h' \in \bigcap_{\alpha \in \mathit{Ag}} \mathit{ch}_\alpha(m, h) \;\Longrightarrow\; h \equiv_m h'.$$

The condition of agent completeness was first discussed by Franz von 
Kutschera [14]. It may be read as: ‘No splitting of the tree without the in- 
volvement of at least one of the agents.’ 
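
The conditions of definition 2.2 and agent-completeness can be checked mechanically on a small finite example. The following Python sketch is an illustration only and not part of the formal development: the tree `PARENT`, the agent names `a` and `b`, and all function names are hypothetical. Since a finite tree has last moments, the sketch treats a history as trivially undivided from itself when testing agent-completeness; this convention is an assumption made for the finite case.

```python
from itertools import product

# Hypothetical finite tree: each moment maps to its immediate predecessor.
PARENT = {"r": None, "m00": "r", "m01": "r", "m10": "r", "m11": "r"}
AGENTS = ["a", "b"]

def earlier(m):
    """All moments strictly earlier than m (its unique past)."""
    past = []
    while PARENT[m] is not None:
        m = PARENT[m]
        past.append(m)
    return past

def hist(leaf):
    """The history (maximal chain) ending in the given leaf."""
    return frozenset([leaf, *earlier(leaf)])

LEAVES = [m for m in PARENT if m not in PARENT.values()]
HIS = [hist(leaf) for leaf in LEAVES]

def his(m):
    """His(m): the histories passing through moment m."""
    return {h for h in HIS if m in h}

def undivided(h1, h2, m):
    """h1 and h2 share a moment strictly later than m."""
    return any(m in earlier(x) for x in h1 & h2)

# Choice map at the root: agent a fixes the first index, agent b the second.
CH = {
    ("a", "r"): [{hist("m00"), hist("m01")}, {hist("m10"), hist("m11")}],
    ("b", "r"): [{hist("m00"), hist("m10")}, {hist("m01"), hist("m11")}],
}

def choices(agent, m):
    """Ch_agent(m); moments without an entry get the vacuous choice {His(m)}."""
    return CH.get((agent, m), [his(m)])

def is_partition(blocks, universe):
    return (all(blocks) and set().union(*blocks) == universe
            and sum(len(b) for b in blocks) == len(universe))

def is_agent_tree(agents):
    for m in PARENT:
        for a in agents:
            blocks = choices(a, m)
            if not is_partition(blocks, his(m)):
                return False
            # Condition (a): undivided histories are never separated.
            for X in blocks:
                if any(undivided(h, h2, m) and h2 not in X
                       for h in X for h2 in his(m)):
                    return False
        # Condition (b): every combination of choices has a common history.
        for family in product(*(choices(a, m) for a in agents)):
            if not set.intersection(*family):
                return False
    return True

def is_agent_complete(agents):
    for m in PARENT:
        for h in his(m):
            common = set.intersection(
                *(X for a in agents for X in choices(a, m) if h in X))
            # Finite-tree convention (assumption): h is undivided from itself.
            if any(h2 != h and not undivided(h, h2, m) for h2 in common):
                return False
    return True
```

With both agents the example is an agent-complete agent tree. If agent `b` is dropped, the root still splits into four histories although `a` only distinguishes two blocks, so agent-completeness fails.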

3 Alternating-Time Temporal Logic 

Alternating-time temporal logic has been introduced to enrich the expressive 
power of computation tree logics (CTL) for model checking purposes. While 
CTL is considered a suitable representation for closed reactive systems, that is, 







systems that are completely determined by their current state, ATL aims at open 
systems, that is, systems that allow for interaction with their environment. 

To this end, Alur, Henzinger, and Kupferman [1] introduce the concept of
alternating transition system, which extends the concept of transition system as
discussed in CTL. The difference between CTL transition systems and
alternating transition systems is characterized as follows: ‘While in ordinary transition
systems, each transition corresponds to a possible step of the system, in
alternating transition systems, each transition corresponds to a possible move in the
game between the system and the environment’ [2].

Definition 3.1 (Alternating Transition Frame). An alternating transition
frame (abbr. ATF) is a triple $\mathcal{F} = \langle \Sigma, Q, \delta \rangle$, where

(a) $\Sigma$ is a (non-void) set of agents,

(b) $Q$ is a (non-void) set of states, and

(c) $\delta : Q \times \Sigma \to 2^{2^Q}$ is a map that assigns to each state $q$ and each agent $\alpha$
a non-void set of choices, $\delta(q, \alpha)$, i.e., each choice is a set of possible next
states,

such that for each state $q$ and for each family $(Q_\alpha)_{\alpha \in \Sigma}$ of choices at $q$, i.e.,
$Q_\alpha \in \delta(q, \alpha)$, there exists exactly one state $q^*$ with

$$q^* \in \bigcap_{\alpha \in \Sigma} Q_\alpha.$$

In the sequel, the function $\delta$ will be referred to as the transition function.

Some notation: Let $q$ and $q'$ be states of an ATF $\mathcal{F}$ and let $\alpha$ be an agent.
State $q'$ is said to be an $\alpha$-successor of $q$ if there exists a $Q' \in \delta(q, \alpha)$ with $q' \in Q'$.
The set of all $\alpha$-successors is denoted by $\mathit{Succ}(q, \alpha)$. State $q'$ is a successor of $q$
if, in state $q$, each agent $\alpha$ has a choice $Q_\alpha$ containing $q'$. The intuition behind this
definition is that $q'$ is a successor of $q$ if and only if, in state $q$, all the agents of
$\mathcal{F}$ can cooperate in such a way that $q'$ becomes the next state.

A (full) computation of $\mathcal{F}$ is an infinite sequence of states, $\lambda = (q_i)_{i \in \mathbb{N}}$, where
each $q_{i+1}$ is a successor of $q_i$. A finite computation (of length $n$) is an initial
segment $\gamma = (q_1, \dots, q_n)$ of a full computation. For a finite computation $\gamma$ let
$n_\gamma$ be the length of $\gamma$. A $q$-computation is a computation starting in state $q$. In
the sequel, $\lambda[i]$ will denote the $i$-th state of $\lambda$. The set of all (full) computations
of $\mathcal{F}$ will be denoted by $\Lambda_{\mathcal{F}}$ and the set of all finite computations by $\Gamma_{\mathcal{F}}$.

The following example (cf. [1]) may help to illustrate the notions just intro- 
duced. 

Example 3.2. Consider a system $S$ with two processes $\alpha$ and $\beta$. In each state
of the system, process $\alpha$ determines the truth value of proposition $x$ and likewise
process $\beta$ that of $y$. We will assume that the system is completely described by
propositions $x$ and $y$, i.e., $Q = \{q, q_x, q_y, q_{xy}\}$, where $q_x$ denotes the state in
which $x$ is true in the system, but $y$ is not, etc. The transition function of the
system is defined as follows: If $S$ is in a state in which $x$ is false, $\alpha$ is free to leave
the truth value of $x$ unchanged or to change it to true. Otherwise, $\alpha$ leaves $x$
unchanged. Similarly, if $y$ is false, $\beta$ can leave the value of $y$ unchanged or make
it true, and if $y$ is true, $\beta$ leaves the truth value of $y$ unchanged. This transition
function can be defined formally by:

$$
\begin{array}{ll}
\delta(q, \alpha) = \{\{q, q_y\}, \{q_x, q_{xy}\}\} & \delta(q, \beta) = \{\{q, q_x\}, \{q_y, q_{xy}\}\} \\
\delta(q_x, \alpha) = \{\{q_x, q_{xy}\}\} & \delta(q_x, \beta) = \{\{q, q_x\}, \{q_y, q_{xy}\}\} \\
\delta(q_y, \alpha) = \{\{q, q_y\}, \{q_x, q_{xy}\}\} & \delta(q_y, \beta) = \{\{q_y, q_{xy}\}\} \\
\delta(q_{xy}, \alpha) = \{\{q_x, q_{xy}\}\} & \delta(q_{xy}, \beta) = \{\{q_y, q_{xy}\}\}
\end{array}
$$
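
As an illustration, the transition function of example 3.2 can be written down as a Python dictionary and the defining condition of definition 3.1 checked by brute force. This is a minimal sketch; the state labels (`"q"`, `"qx"`, ...), the agent labels, and the function names are naming assumptions made here for readability.

```python
from itertools import product

STATES = ["q", "qx", "qy", "qxy"]
AGENTS = ["alpha", "beta"]

# Transition function of Example 3.2: (state, agent) -> list of choices.
DELTA = {
    ("q", "alpha"): [{"q", "qy"}, {"qx", "qxy"}],
    ("qx", "alpha"): [{"qx", "qxy"}],
    ("qy", "alpha"): [{"q", "qy"}, {"qx", "qxy"}],
    ("qxy", "alpha"): [{"qx", "qxy"}],
    ("q", "beta"): [{"q", "qx"}, {"qy", "qxy"}],
    ("qx", "beta"): [{"q", "qx"}, {"qy", "qxy"}],
    ("qy", "beta"): [{"qy", "qxy"}],
    ("qxy", "beta"): [{"qy", "qxy"}],
}

def is_atf(delta, states, agents):
    """Every family of choices (one per agent) meets in exactly one state."""
    return all(
        len(set.intersection(*family)) == 1
        for q in states
        for family in product(*(delta[(q, a)] for a in agents))
    )

def agent_successors(delta, q, agent):
    """Succ(q, agent): states occurring in some choice of the agent at q."""
    return set().union(*delta[(q, agent)])

def successors(delta, q, agents):
    """States that every agent has in at least one of its choices at q."""
    return set.intersection(*(agent_successors(delta, q, a) for a in agents))
```

In this encoding, `successors(DELTA, "qx", AGENTS)` contains only `qx` and `qxy`, although the choices of `beta` at `qx` also mention `q` and `qy`.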



4 From Alternating Transition Frames to Agent Trees 

In what follows we investigate the conditions under which the semantics of
ATL can be embedded into the Kutschera-Belnap approach. To start with, let
$\mathcal{F} = \langle \Sigma, Q, \delta \rangle$ be an ATF that satisfies the following two conditions:

(d) For each agent $\alpha$ and each state $q$, $\delta(q, \alpha)$ is a partition of $\mathit{Succ}(q, \alpha)$.

(e) For each agent $\alpha$ and each state $q$, if $q'$ is an $\alpha$-successor of $q$, then $q'$ is a
$\beta$-successor of $q$ for each agent $\beta$.

What is the meaning of these conditions? First, condition (d) seems quite
plausible when looking at concrete examples of alternating transition frames.
For condition (e), let us assume that $q'$ is an $\alpha$-successor of $q$, but that there is
an agent $\beta$ such that $q'$ is not a $\beta$-successor of $q$. From this it follows that there
does not exist any computation $\lambda$ with $\lambda[i] = q$ and $\lambda[i+1] = q'$ for some $i \in \mathbb{N}$.
But this means that we can withdraw $q'$ from every $Q \in \delta(q, \alpha)$ without losing
any information about the possible runs of the system. If we do that for every
agent at the same time, we obtain an ATF that satisfies condition (e).

For example, we could redefine the transition function of example 3.2 as
follows:

$$
\begin{array}{ll}
\delta(q, \alpha) = \{\{q, q_y\}, \{q_x, q_{xy}\}\} & \delta(q, \beta) = \{\{q, q_x\}, \{q_y, q_{xy}\}\} \\
\delta(q_x, \alpha) = \{\{q_x, q_{xy}\}\} & \delta(q_x, \beta) = \{\{q_x\}, \{q_{xy}\}\} \\
\delta(q_y, \alpha) = \{\{q_y\}, \{q_{xy}\}\} & \delta(q_y, \beta) = \{\{q_y, q_{xy}\}\} \\
\delta(q_{xy}, \alpha) = \{\{q_{xy}\}\} & \delta(q_{xy}, \beta) = \{\{q_{xy}\}\}
\end{array}
$$

This new transition function does not lose any information carried by the old
one, but it does satisfy conditions (d) and (e).
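
The withdrawal step described above can be sketched in Python. The sketch assumes the transition function of example 3.2 encoded as sets of frozensets; the names `ch`, `restrict`, and `is_restricted` are hypothetical helpers introduced only for this illustration.

```python
def ch(*blocks):
    """Build a choice set as a set of frozensets."""
    return {frozenset(b) for b in blocks}

STATES = ["q", "qx", "qy", "qxy"]
AGENTS = ["alpha", "beta"]

# Original transition function of Example 3.2.
DELTA = {
    ("q", "alpha"): ch({"q", "qy"}, {"qx", "qxy"}),
    ("qx", "alpha"): ch({"qx", "qxy"}),
    ("qy", "alpha"): ch({"q", "qy"}, {"qx", "qxy"}),
    ("qxy", "alpha"): ch({"qx", "qxy"}),
    ("q", "beta"): ch({"q", "qx"}, {"qy", "qxy"}),
    ("qx", "beta"): ch({"q", "qx"}, {"qy", "qxy"}),
    ("qy", "beta"): ch({"qy", "qxy"}),
    ("qxy", "beta"): ch({"qy", "qxy"}),
}

def succ(delta, q, agent):
    """Succ(q, agent): union of the agent's choices at q."""
    return frozenset().union(*delta[(q, agent)])

def successors(delta, q):
    """States that are agent-successors of q for every agent."""
    return frozenset.intersection(*(succ(delta, q, a) for a in AGENTS))

def restrict(delta):
    """Withdraw from every choice the states that are not successors."""
    return {(q, a): {c & successors(delta, q) for c in delta[(q, a)]}
            for (q, a) in delta}

def is_restricted(delta):
    for q in STATES:
        for a in AGENTS:
            blocks = delta[(q, a)]
            # (d): non-empty, pairwise disjoint blocks partition Succ(q, a).
            if not all(blocks) or sum(len(c) for c in blocks) != len(succ(delta, q, a)):
                return False
            # (e): every a-successor is a successor for all agents.
            if succ(delta, q, a) != successors(delta, q):
                return False
    return True
```

Applied to `DELTA`, `restrict` reproduces the redefined transition function given above; the original function fails condition (e) at `qx`, where `beta` offers `q` and `qy` although `alpha` does not.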

Finally, from conditions (d) and (e) it follows that for each agent $\alpha$ and each
state $q$, the choice set $\delta(q, \alpha)$ is a partition of the set of $q$-successors. In what
follows an ATF satisfying these two conditions will be referred to as a restricted
ATF.







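Let now $\mathcal{F} = \langle \Sigma, Q, \delta \rangle$ be an arbitrary ATF. We define a tree $B_{\mathcal{F}}$ by
‘unwinding’ $\mathcal{F}$ as follows:

$$\mathit{Mom}_{\mathcal{F}} := \Gamma_{\mathcal{F}}$$

$$\gamma \prec_{\mathcal{F}} \gamma' \;:\Longleftrightarrow\; \gamma[i] = \gamma'[i] \text{ for each } i = 1, \dots, n_\gamma, \text{ and } n_\gamma < n_{\gamma'}.$$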

Lemma 4.1. The ordered pair $B_{\mathcal{F}} = \langle \mathit{Mom}_{\mathcal{F}}, \prec_{\mathcal{F}} \rangle$ is a tree, which has the
following properties:

(a) There exists a bijective map between the set of computations of $\mathcal{F}$, $\Lambda_{\mathcal{F}}$, and
the set of histories of $B_{\mathcal{F}}$.

(b) For each finite computation $\gamma \in \Gamma_{\mathcal{F}}$, there exists a bijective map between the
set of computations with initial segment $\gamma$ and the set of histories of $B_{\mathcal{F}}$ that
pass through ‘moment’ $\gamma$.

Proof. First, it is quite obvious that $\prec_{\mathcal{F}}$ is an irreflexive, transitive, and
linear-to-the-left relation. Second, if $\lambda$ is a (full) computation of $\mathcal{F}$, then obviously

$$h_\lambda := \{\, (\lambda[1], \dots, \lambda[n]) : n \in \mathbb{N} \,\}$$

is a maximal $\prec_{\mathcal{F}}$-chain of $B_{\mathcal{F}}$. Vice versa, let now $h$ be a history of $B_{\mathcal{F}}$. Then
$h$ is a maximal $\prec_{\mathcal{F}}$-chain, i.e., a maximal subset of $\Gamma_{\mathcal{F}}$ that is linearly ordered
by $\prec_{\mathcal{F}}$. Let $\gamma = (q_1, \dots, q_n) \in h$ be chosen arbitrarily. Then define $\lambda_h[1] := q_1, \dots, \lambda_h[n] := q_n$.
Since $q_n$ has at least one successor, $(q_1, \dots, q_n)$ cannot be
the maximal element of $h$. Hence there exists a $\gamma' = (q'_1, \dots, q'_m) \in h$ with
$\gamma \prec_{\mathcal{F}} \gamma'$. Extend $\lambda_h$ by setting $\lambda_h[n+1] := q'_{n+1}, \dots, \lambda_h[m] := q'_m$. By this
step-wise construction, one finally obtains a full computation $\lambda_h$.

It can readily be checked that the assignment $h \mapsto \lambda_h$ is the inverse of the
mapping $\lambda \mapsto h_\lambda$. From this both claims (a) and (b) follow immediately. □
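
The unwinding can be explored on a finite fragment. The following Python sketch enumerates the finite computations of example 3.2 up to a bounded length and checks the three order properties on that fragment. It is only a depth-bounded approximation, since histories themselves are infinite. The table `SUCC` was derived by hand from the transition function of example 3.2, and all names are illustrative assumptions.

```python
STATES = ["q", "qx", "qy", "qxy"]

# Successor relation of Example 3.2 (derived by hand from its transition function).
SUCC = {
    "q": ["q", "qx", "qy", "qxy"],
    "qx": ["qx", "qxy"],
    "qy": ["qy", "qxy"],
    "qxy": ["qxy"],
}

def moments(max_len):
    """All finite computations of length <= max_len, as tuples of states."""
    level = [(q,) for q in STATES]
    found = list(level)
    for _ in range(max_len - 1):
        level = [g + (q2,) for g in level for q2 in SUCC[g[-1]]]
        found.extend(level)
    return found

def earlier(g1, g2):
    """g1 is earlier than g2 iff g1 is a proper initial segment of g2."""
    return len(g1) < len(g2) and g2[:len(g1)] == g1

def is_tree_order(ms):
    irreflexive = not any(earlier(g, g) for g in ms)
    transitive = all(earlier(a, c) for a in ms for b in ms for c in ms
                     if earlier(a, b) and earlier(b, c))
    # Linear to the left: any two moments earlier than c are comparable.
    left_linear = all(a == b or earlier(a, b) or earlier(b, a)
                      for c in ms for a in ms for b in ms
                      if earlier(a, c) and earlier(b, c))
    return irreflexive and transitive and left_linear

def through(ms, gamma, length):
    """Moments of the given length that extend (or equal) gamma."""
    return [g for g in ms if len(g) == length and g[:len(gamma)] == gamma]
```

Because every state of the frame yields a computation of length one, the fragment has one root per state; the order is nevertheless irreflexive, transitive, and linear to the left, which is all that definition 2.1 requires.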

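Let now $\mathcal{F} = \langle \Sigma, Q, \delta \rangle$ be a restricted ATF. For a finite computation $\gamma$, let
$\Lambda(\gamma)$ be the set of all full computations $\lambda$ that have $\gamma$ as initial segment. Define

$$\mathit{Ag}_{\mathcal{F}} := \Sigma$$

$$X \in \mathit{Ch}^{\mathcal{F}}_\alpha(\gamma) \;:\Longleftrightarrow\; \text{there exists a } Q \in \delta(\gamma[n_\gamma], \alpha) \text{ such that } X = \{\, h_\lambda : \lambda \in \Lambda(\gamma) \text{ and } \lambda[n_\gamma + 1] \in Q \,\}.$$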

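Theorem 4.2. For each restricted ATF $\mathcal{F} = \langle \Sigma, Q, \delta \rangle$, the triple

$$C_{\mathcal{F}} = \langle B_{\mathcal{F}}, \mathit{Ag}_{\mathcal{F}}, \mathit{Ch}^{\mathcal{F}} \rangle$$

is an agent-complete agent tree.

Proof. From conditions (d) and (e) it follows that each $\delta(q, \alpha)$ is a partition of
the set of successors of $q$. Let $\gamma$ be a finite computation of $\mathcal{F}$. Then, by applying
lemma 4.1 (b), we immediately verify that each $\mathit{Ch}^{\mathcal{F}}_\alpha(\gamma)$ is a partition of the set
of histories of $B_{\mathcal{F}}$ that pass through $\gamma$. Conditions (a) and (b) of definition 2.2
are easy to check. Finally, $C_{\mathcal{F}}$ is agent-complete, since for each family $(Q_\alpha)_{\alpha \in \Sigma}$
with $Q_\alpha \in \delta(q, \alpha)$, there exists at most one state $q^* \in \bigcap_{\alpha \in \Sigma} Q_\alpha$. □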







5 From Agent Trees to Alternating Transition Frames 

In the following we will establish an embedding of the semantics of ATL into the
KB-approach to agency. The first step in the definition of this embedding is
to represent the ATL concept of state in the framework of the KB-approach.
However, there does not exist any straightforward way of defining the notion of
state in terms of moments.

To see this, let us assume that we aim at describing a system S with a 
state set Q. Each q £ Q, then, corresponds to a complete description of the 
system at some time-point. However, when we look at the tree whose branches 
are the possible computations of the system (as we did in the previous section) 
the information about possible states of the system has disappeared. Clearly, 
at each moment (in the sense defined above), the system is in a certain (total) 
state, but we are not able to identify moments that are in the same state.³

Definition 5.1 (State Tree). A state tree is an ordered triple $B =
\langle \mathit{Mom}, \prec, \mathit{Tot} \rangle$ consisting of a non-void set of moments, $\mathit{Mom}$, an irreflexive,
transitive, and linear-to-the-left relation $\prec$ on $\mathit{Mom}$, and a partition $\mathit{Tot}$ of
$\mathit{Mom}$.

The elements of $\mathit{Tot}$ are referred to as total states. We say that moments $m$
and $m'$ are in the same (total) state if there exists a $t \in \mathit{Tot}$ such that $m$ and
$m'$ are contained in $t$. For each moment $m$, let $t_m$ denote the unique total state
that contains $m$.⁴ It is worth noting that a total state may have different pasts,
while a moment can only have exactly one past.

In what is to follow, we will restrict consideration to discrete trees, more
precisely, to trees where each history is order-isomorphic to the set of natural
numbers. Such trees will be referred to as trees over $\mathbb{N}$. In a tree over $\mathbb{N}$ each
moment $m$ has an immediate successor in each history $h$ passing through moment
$m$, which will be denoted by $m^*_h$.

Let $B$ be a state tree, and let $m$ be a moment of $B$. We define

$$\mathit{Tot}^*(m) := \{\, t_{m^*_h} : h \in \mathit{His}(m) \,\},$$

the set of possible next total states at moment $m$.

³ Here and in what is to follow we adopt the following terminology: States correspond
to (maybe incomplete) momentary descriptions of a system, while total states
correspond to complete momentary descriptions. Thus, states in the sense of the ATL
semantics are total states in the sense of the terminology used here.

⁴ The concept of state may be introduced in terms of total states as follows: A state
is a subset of $\mathit{Mom}$ that can be written as a union of total states, i.e.,

$$s \in \mathit{Stat} \;:\Longleftrightarrow\; \text{there is a } \tau \subseteq \mathit{Tot} \text{ with } s = \bigcup \tau.$$

Note that this enables us to speak about the inconsistent state, which is distinct
from each total state. Furthermore, $\mathit{Tot}$ and $\mathit{Stat}$ are subsets of $2^{\mathit{Mom}}$, and each total
state is a state.







Definition 5.2. A state tree is said to be uniform if for each $t \in \mathit{Tot}$ and all
$m, m' \in t$,

$$\mathit{Tot}^*(m) = \mathit{Tot}^*(m').$$

The underlying idea of uniformity is that the partitioning of moments into
total states is respected by the successor relation, i.e., that if $m$ and $m'$ are in
the same total state, then $m$ and $m'$ have the same possible next total states.
From the point of view of the Kutschera-Belnap approach, uniformity seems a
very restrictive condition.

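If $\mathcal{F}$ is an ATF, then we can use the definitions of section 4 to define a state
tree $B_{\mathcal{F}} = \langle \mathit{Mom}_{\mathcal{F}}, \prec_{\mathcal{F}}, \mathit{Tot}_{\mathcal{F}} \rangle$ by

$$\gamma \sim_{\mathcal{F}} \gamma' \;:\Longleftrightarrow\; \gamma[n_\gamma] = \gamma'[n_{\gamma'}]$$

$$\mathit{Tot}_{\mathcal{F}} := \Gamma_{\mathcal{F}} / {\sim_{\mathcal{F}}}.$$

Note that there exists a bijective map between the state set of $\mathcal{F}$, $Q$, and the
set $\mathit{Tot}_{\mathcal{F}}$.

Lemma 5.3. Let $\mathcal{F}$ be an ATF. Then the tree $B_{\mathcal{F}}$ is uniform. □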

Definition 5.4 (Agent state tree). An agent state tree is a triple $C =
\langle B, \mathit{Ag}, \mathit{Ch} \rangle$, where $B$ is a state tree, $\mathit{Ag}$ is a set of agents, and $\mathit{Ch}$ is a choice
map as specified in definition 2.2.

Obviously, $\mathit{Ch}_\alpha(m)$ induces a partition of the set of successor moments of
$m$, $\{\, m^*_h : m \in h \,\}$. But, as can be seen from simple examples, $\mathit{Ch}_\alpha(m)$ does
not induce a partition of $\mathit{Tot}^*(m)$. Therefore, we need to extend the uniformity
condition of the previous paragraph. Define

$$\mathit{Tot}^*(m, X) := \{\, t_{m^*_h} : m \in h \in X \,\} = \{\, t_{m^*_h} : h \in \mathit{His}(m) \cap X \,\}$$

where $X \in \mathit{Ch}_\alpha(m)$, i.e., $\mathit{Tot}^*(m, X)$ is the set of possible next total states in
case that $\alpha$ chooses $X$ at moment $m$.

Definition 5.5. An agent state tree over $\mathbb{N}$ is said to be uniform if the choice
map $\mathit{Ch}$ respects uniformity, i.e., if for each agent $\alpha$, each total state $t$, and each
pair of moments $m, m' \in t$,

$$\{\, \mathit{Tot}^*(m, X) : X \in \mathit{Ch}_\alpha(m) \,\} = \{\, \mathit{Tot}^*(m', X') : X' \in \mathit{Ch}_\alpha(m') \,\}.$$

Note that if an agent state tree is uniform, its underlying state tree is so,
too. This follows from the fact that $\mathit{Tot}^*(m) = \bigcup_{X \in \mathit{Ch}_\alpha(m)} \mathit{Tot}^*(m, X)$.



Lemma 5.6. Let $\mathcal{F}$ be a restricted ATF. Then the agent state tree $C_{\mathcal{F}}$ (as
defined in this and the previous section) is uniform. □







Let now $C$ be a uniform agent state tree over $\mathbb{N}$. We set

$$\Sigma_C := \mathit{Ag}$$

$$Q_C := \mathit{Tot}$$

$$\delta_C(t, \alpha) := \{\, \mathit{Tot}^*(m, X) : X \in \mathit{Ch}_\alpha(m) \,\}$$

where $m$ is an arbitrarily fixed element of $t$. This definition is well defined since
the tree $C$ is uniform. Note that $\mathit{Tot}^*(m, X)$ is a set of total states. Hence
$\mathit{Tot}^*(m, X) \in 2^{\mathit{Tot}}$, and thus $\delta_C(t, \alpha) \in 2^{2^{\mathit{Tot}}}$.

We are now ready to state our second theorem: 

Theorem 5.7. Let $C$ be an agent-complete and uniform agent state tree over
$\mathbb{N}$. Then

$$\mathcal{F}_C = \langle \Sigma_C, Q_C, \delta_C \rangle$$

is a restricted alternating transition frame.

Proof. There is almost nothing left to be proven. Let $t \in Q_C$ be a (total) state,
and let $(Q_\alpha)_{\alpha \in \Sigma_C}$ be a family with $Q_\alpha \in \delta_C(t, \alpha)$. Choose an arbitrary $m \in t$.
Then, for each $Q_\alpha$, there exists a $\chi(\alpha) \in \mathit{Ch}_\alpha(m)$ with $Q_\alpha = \mathit{Tot}^*(m, \chi(\alpha))$.
By applying condition (b) of definition 2.2, there exists a history $h$ that is contained in each
$\chi(\alpha)$. Since the tree is agent-complete, $h$ is uniquely determined up to undivided
histories. This means that $m^*_{h'} = m^*_h$ for each history $h' \in \bigcap_\alpha \chi(\alpha)$. Since the
$\delta_C(t, \alpha)$ do not depend on the particular choice of $m$ in $t$, there exists exactly
one total state that is contained in each $Q_\alpha$, namely $t_{m^*_h}$. That the frame $\mathcal{F}_C$
is restricted follows from the fact that each $\mathit{Ch}_\alpha(m)$ is a partition of the set of
histories that pass through moment $m$. □

Finally, if an agent state tree $C$, as specified in the theorem, is a ‘forest’ such
that each total state is realized in exactly one of its root moments, then there
exists a bijection between the set of histories of $C$ and the set of computations
of $\mathcal{F}_C$.

6 Summary and Outlook 

In this paper we focused on two approaches to qualitative action theory, the
Kutschera-Belnap approach and the semantics of alternating-time temporal
logic. Though at first glance both approaches are very close in spirit, they turn
out to be equivalent only if the basic semantics of each is modified. If reasonable
conditions on alternating transition frames are enforced,
these frames can be shown to induce agent trees. Vice versa, agent trees do
induce alternating transition frames if they are enriched with the notion of state
and if some uniformity conditions are assumed. However, from the point of view
of the Kutschera-Belnap approach, these uniformity constraints seem very
special. The best interpretation of them is to read ATL choices (i.e., the elements of






the agent-dependent transition function) as action types in the following sense:
Each agent has a repertoire of ‘procedures’ that can be performed when the
system is in a particular state. Whether such a procedure can be performed
depends on the current state only and not on which of the many possible pasts the
system has passed through to reach this state. Contrary to this, choices
in the Kutschera-Belnap approach are assigned with respect to moments, i.e.,
with respect to the current state and one particular past of that state.

It is also worthwhile to note that in ATL the notion of strategy is defined in a
more KB-like manner, i.e., strategies are not defined with respect to single states
only, but are defined with respect to finite computations. More precisely, in the
KB-approach a (strict) strategy of an agent $\alpha$ is a partial function $\sigma$ that has as
its domain a non-void convex subset of $\mathit{Mom}$, $\mathrm{dom}(\sigma)$, and that assigns to each
$m \in \mathrm{dom}(\sigma)$ a choice $\sigma(m) \in \mathit{Ch}_\alpha(m)$. In the ATL-approach a strategy of $\alpha$ is
a map that assigns to each finite computation $\gamma$ a choice $Q \in \delta(\gamma[n_\gamma], \alpha)$. The
close relationship between these two notions of strategy should now be obvious.
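
The ATL notion of strategy can be illustrated with the frame of example 3.2. In the following Python sketch a strategy is a function from finite computations (tuples of states) to choices, and `outcomes` lists the finite computations that remain possible when the agent follows it. The strategy `make_x_true` and all other names are hypothetical; the strategy shown happens to inspect only the last state, although its argument is the whole finite computation.

```python
AGENTS = ["alpha", "beta"]

# Transition function of Example 3.2: (state, agent) -> list of choices.
DELTA = {
    ("q", "alpha"): [{"q", "qy"}, {"qx", "qxy"}],
    ("qx", "alpha"): [{"qx", "qxy"}],
    ("qy", "alpha"): [{"q", "qy"}, {"qx", "qxy"}],
    ("qxy", "alpha"): [{"qx", "qxy"}],
    ("q", "beta"): [{"q", "qx"}, {"qy", "qxy"}],
    ("qx", "beta"): [{"q", "qx"}, {"qy", "qxy"}],
    ("qy", "beta"): [{"qy", "qxy"}],
    ("qxy", "beta"): [{"qy", "qxy"}],
}

def successors(q):
    """States that every agent has in at least one of its choices at q."""
    return set.intersection(*(set().union(*DELTA[(q, a)]) for a in AGENTS))

def make_x_true(gamma):
    """Hypothetical strategy for alpha: prefer a choice in which x holds."""
    options = DELTA[(gamma[-1], "alpha")]
    for choice in options:
        if choice <= {"qx", "qxy"}:
            return choice
    return options[0]

def outcomes(gamma, strategy, steps):
    """Finite computations extending gamma by `steps` states
    while alpha follows the given strategy."""
    if steps == 0:
        return [gamma]
    allowed = strategy(gamma) & successors(gamma[-1])
    result = []
    for nxt in sorted(allowed):
        result.extend(outcomes(gamma + (nxt,), strategy, steps - 1))
    return result
```

Starting from `("q",)`, every computation consistent with `make_x_true` visits only states in which x is true after the first step, whatever `beta` does.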

The results presented in this paper provide the starting point of an interesting
research topic: What are the connections between the logics that are defined
with respect to the semantic concepts presented here? But an answer to this
question would go far beyond the scope of this paper.

Acknowledgement. This work has been partially supported by the Deutsche
Forschungsgemeinschaft (DFG). I am grateful for helpful discussions with
Valentin Goranko and Alberto Zanardo.

References 

1. Alur, R., Henzinger, T.A., Kupferman, O.: Alternating-time temporal logic. In: 
Proceedings of the 38th Symposium on Foundations of Computer Science. (1997) 

2. Alur, R., Henzinger, T.A., Kupferman, O.: Alternating-time temporal logic. Jour- 
nal of the ACM (JACM) 49 (2002) 672-713 

3. Åqvist, L.: A new approach to the logical theory of actions and causality. In
Stenlund, S., ed.: Logical Theory and Semantic Analysis: Essays Dedicated to Stig
Kanger on His Fiftieth Birthday. D. Reidel, Dordrecht (1974) 73-91

4. Belnap, N.D.: Backwards and forwards in the modal logic of agency. Philosophy 
and Phenomenological Research 51 (1991) 777-807 

5. Belnap, N.D.: Before refraining: Concepts for agency. Erkenntnis 34 (1991)
137-169

6. Belnap, N.D., Perloff, M.: Seeing to it that: A canonical form for agentives. In 
Kyburg, Jr., H.E., Loui, R.P., Carlson, G.N., eds.: Knowledge Representation and 
Defeasible Reasoning. Kluwer Academic Publisher, Dordrecht (1990) 175-199 

7. Belnap, N.D., Perloff, M.: The way of the agent. Studia Logica 51 (1992) 463-484 

8. Belnap, N., Perloff, M.: In the realm of agents. Annals of Mathematics and
Artificial Intelligence 9 (1993) 25-48

9. Belnap, N.D., Perloff, M., Xu, M.: Facing the Future: Agents and their Choices in 
our Indeterminist World. Oxford University Press, New York (2001) 

10. Chellas, B.F.: Time and modality in the logic of agency. Studia Logica 51 (1992) 
485-517 




Qualitative Action Theory 






11. Goranko, V.: Coalition games and alternating temporal logic. In: Proceedings of 
the 8th Conference on Theoretical Aspects of Rationality and Knowledge (TARK 
VIII), Morgan Kaufmann (2001) 259-272 

12. Horty, J.F., Belnap, N.D.: The deliberative stit: A study of action, omission, ability, 
and obligation. Journal of Philosophical Logic 24 (1995) 583-644 

13. Kutschera, F.v.: Grundbegriffe der Handlungslogik. In Lenk, H., ed.:
Handlungstheorien interdisziplinär I. Fink, München (1980) 67-106

14. Kutschera, F.v.: Bewirken. Erkenntnis 24 (1986) 253-281 

15. Pauly, M.: A modal logic for coalition power in games. Journal of Logic and 
Computation (2000) 

16. Pauly, M.: A logical framework for coalitional effectivity in dynamic procedures. 
In: Proceedings in the 4th Conference on Logic and the Foundations of Game and 
Decision Theory (LOFT4), Torino (2000) 

17. Wölfl, S.: Propositional Q-logic. Journal of Philosophical Logic 31 (2002) 387-414

18. Wölfl, S.: Review of Nuel Belnap, Michael Perloff, and Ming Xu's Facing the Future.
Notre Dame Philosophical Reviews (2002). Published at:
http://ndpr.icaap.org/content/archives/2002/8/wolfl-belnap.html.

19. Xu, M.: Decidability of stit theory with a single agent and refref equivalence. 
Studia Logica 53 (1994) 259-298 

20. Xu, M.: Doing and refraining from refraining. Journal of Philosophical Logic 23 
(1994) 621-632 

21. Xu, M.: Decidability of deliberative stit theories with multiple agents. In Gabbay,
D.M., Ohlbach, H.J., eds.: Temporal Logic. Lecture Notes in Artificial
Intelligence 827, Springer, Berlin (1994) 332-348

22. Xu, M.: Busy choice sequences, refraining formulas, and modalities. Studia Logica 
54 (1995) 267-301 

23. Xu, M.: On the basic logic of stit with a single agent. Journal of Symbolic Logic 
60 (1995) 459-483 

24. Xu, M.: Axioms for deliberative stit. Journal of Philosophical Logic 27 (1998) 
505-552 




Practical Reasoning for Uncertain Agents 



Nivea de C. Ferreira, Michael Fisher, and Wiebe van der Hoek 

University of Liverpool, Department of Computer Science, UK 
{niveacf, michael, wiebe}@csc.liv.ac.uk 



Abstract. Logical formalisation of agent behaviour is desirable, not
only in order to provide a clear semantics of agent-based systems, but
also to provide the foundation for sophisticated reasoning techniques to
be used on, and by, the agents themselves. The possible worlds semantics
offered by modal logic has proved to be a successful framework in which
to model mental attitudes of agents such as beliefs, desires and
intentions. The most popular choices for modelling the informational
attitudes involve annotating the agent with an S5-like logic for
knowledge, or a KD45-like logic for belief. However, using these logics
in their standard form, an agent cannot distinguish situations in which
the evidence for a certain fact is 'equally distributed' over its
alternatives from situations in which there is only one, almost
negligible, counterexample to the 'fact'. Probabilistic modal logics are
a way to address this, but they easily end up being both computationally
and conceptually complex, for example often lacking the property of
compactness. In this paper, we propose a probabilistic modal logic
$P_FKD45$, in which the probabilities of the possible worlds range over a
finite domain of values, while still allowing the agent to reason about
infinitely many options. In this way, the logic remains compact,
implying that the agent still has to consider only finitely many
possibilities for probability distributions during a reasoning task. We
demonstrate a sound, compact and complete axiomatisation for $P_FKD45$
and show that it has several appealing features. Then, we discuss an
implemented decision procedure for the logic, and provide a small
example. Finally we show that, rather than specifying them beforehand,
the finite set of possible probabilities can be obtained directly from
the problem specification.



1 Introduction 

In both reasoning about agents and in reasoning within agents, it is vital to 
choose tools that allow the representation of information at an appropriate level 
of abstraction, yet being simple enough to be mechanised. Logical formalisa- 
tions of such informational aspects have been particularly successful, often using 
modal logics such as S5 for knowledge, or KD45 for belief. However, it is clear 
that, in realistic scenarios, such descriptions need to incorporate uncertainty. 
Without such descriptive flexibility, logical approaches cannot effectively repre- 
sent real-world concerns and so cannot be used as the basis for practical reasoning 
in agents acting with uncertain information. While there have been some steps 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 82-94, 2004. 
© Springer-Verlag Berlin Heidelberg 2004







in developing logics of uncertainty or logics of probability (see Section 6), many
of these (for example, probabilistic modal logics) are both computationally and
conceptually complex. In particular, a significant drawback is that many such
approaches lack the property of compactness¹.

In this paper we present a new probabilistic modal logic (called $P_FKD45$)
that builds upon the natural framework of Kripke models (the basis of modal 
logics), while allowing reasoning about uncertainty. Importantly, in this logic, 
the probabilities of the possible worlds range over a finite domain of values, 
while still allowing the agent to reason about infinitely many options. In this 
way, the logic remains compact, ensuring that the agent only has to consider 
finitely many possibilities for probability distributions during a reasoning task. 

The $P_FKD45$ logic extends, in some aspects, the system $P_FD$ previously
introduced in [9], which in turn was inspired by the system from [3]. The basic
modal operator $P^{>}_x$ allows us to write formulas such as $P^{>}_{0.5}\varphi$,
meaning that the “agent believes $\varphi$ with probability strictly greater than
0.5”. The operators (which have self-explanatory meaning) $P^{\geq}_x$, $P^{<}_x$,
$P^{\leq}_x$ and $P^{=}_x$ can then be defined in terms of the above basis. Since
probabilities range from 0 to 1, $P^{\geq}_1$ corresponds to the modal operator
$\Box$ or $B$. An important property of the logic is that it only allows
probability measures (for each world) that are within a finite base set F.
Although this semantically restricts probability assignments to a finite range, it
is still possible to express and reason about arbitrary probabilities, since there is
no restriction in the language that mirrors this semantic restriction. But again,
in the logic, a particular axiom (Axiom A7; see later) ensures that arbitrary
values collapse to values in the set F. The main motivation for using F is the
restoration of compactness for the logic.

Logics that allow us to express that $Prob(\varphi) \sim x$ are, in general, not
compact. Witness the set of premises $\Gamma$:

$\{Prob(q) > a \mid a \in \mathbb{Q} \cap [0, 1)\}$ (1)

Here, we have $\Gamma \models Prob(q) = 1$, and yet there is no finite subset of
$\Gamma$ that proves this conclusion. This has a computational counterpart: a
mechanical device verifying whether a set of premises $\{Prob(\varphi) \sim x\}$ is
satisfiable in $\mathbb{Q} \cap [0, 1]$ in principle has to check an infinite number
of assignments of probabilities to formulas $\varphi$. The advantage of the
$P_FKD45$ logic is that the range of allowed probabilities is within a finite
base set $F \subseteq [0, 1]$.
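
The failure of compactness for premise set (1) can be illustrated numerically. The following is only a sketch under stated assumptions: the helper name `witness` is hypothetical, and a finite subset of (1) is represented simply by the list of its thresholds a. It shows that any finite subset is satisfied by some value of Prob(q) strictly below 1, so no finite subset can force Prob(q) = 1.

```python
from fractions import Fraction

def witness(thresholds):
    """Return a value v < 1 with v > a for every a in a finite subset of [0, 1).

    Hypothetical helper: `thresholds` stands for the finitely many premises
    Prob(q) > a that were selected from the infinite set (1).
    """
    top = max(thresholds, default=Fraction(0))
    return (top + 1) / 2  # midpoint between the largest threshold and 1

# A finite selection of thresholds approaching 1: 1/2, 2/3, ..., 99/100.
finite_part = [Fraction(k, k + 1) for k in range(1, 100)]
v = witness(finite_part)
print(v < 1, all(v > a for a in finite_part))
```

With a finite base set F the same trick is no longer available, because only finitely many values lie between a threshold and 1.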

Although the use of the base set F causes a logical restriction, it is possible
to highlight some interesting aspects (cf. [9]). For instance, if we take F = {0, 1},
we have classical modal logic. Alternatively, Driankov's linguistic estimates (as
in [2]) impossible, extremely unlikely, very low chance, small chance, it may,
meaningful chance, most likely, extremely likely, certain would be modelled by a
9-element F. In other words, the granularity of F can be chosen according to the
intended application of the agent. However, since one of our main interests is to
use the $P_FKD45$ logic for describing and implementing uncertain agents,
¹ Compactness in the sense that inference in terms of infinite sets coincides with
inference over finite sets.







having a mechanism for directly calculating the set F is very desirable. For this 
purpose, the basic idea is to have F determined by a set of arbitrary probability 
values which are directly extracted from the original agent specification. (Further 
discussion concerning this aspect will be provided in Section 6.) 

In summary, contrary to many other logical approaches to probabilistic rea- 
soning, our logic is both compact and conceptually simple. Thus, it represents a 
strong candidate for representing and reasoning about uncertainty within com- 
putational agents. 

The paper is organised as follows. In Section 2 we present a description of the 
language and, in Section 3, we provide its semantics and establish its properties. 
Since the focus of $P_FD$ was not on a doxastic interpretation of modalities, we
also include two additional properties in the $P_FKD45$ logic (axioms A8 and
A9; see later), in order to represent KD45-like belief. For instance, this allows
us to have a probability distribution independent of worlds, and thus ensure that 
nested belief formulas are equivalent to formulas without nesting. Such issues 
are considered in more detail in Section 3. A decision procedure for the logic 
has been developed and implemented, and this is presented in Section 4. Due 
to space restrictions, only a small motivating example showing the versatility of 
the approach is provided in Section 5. Finally, related work and final remarks 
are presented in Section 6. 

2 Language Description 

The language L of $P_FKD45$ consists of a countable set of propositional symbols,
the logical connectives $\neg$ and $\vee$ (with standard definitions for $\bot$,
$\top$, $\wedge$, $\rightarrow$, $\leftrightarrow$), and parentheses. We also define
a modal operator $P^{>}_x$, where $x$ is a real number within the interval [0, 1].

Definition 1. A set F is a base for a logic $P_FKD45$ if it satisfies:

1. F is finite;

2. $\{0, 1\} \subseteq F \subseteq [0, 1]$;

3. $x, y \in F$ and $(x + y \leq 1) \Rightarrow (x + y) \in F$;

4. $x \in F \Rightarrow (1 - x) \in F$.
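
The four conditions of Definition 1 are easy to check mechanically. The following is a minimal sketch, assuming base sets are given as Python sets of exact fractions; the function names `is_base` and `generated_base` are illustrative and not part of the paper.

```python
from fractions import Fraction
from itertools import product

def is_base(values):
    """Check the conditions of Definition 1 on a finite collection of Fractions."""
    F = set(values)  # a Python set is finite by construction (condition 1)
    if not {Fraction(0), Fraction(1)} <= F:          # condition 2: 0 and 1 in F
        return False
    if any(x < 0 or x > 1 for x in F):               # condition 2: F within [0, 1]
        return False
    for x, y in product(F, repeat=2):                # condition 3: sums up to 1
        if x + y <= 1 and x + y not in F:
            return False
    return all(1 - x in F for x in F)                # condition 4: complements

def generated_base(d):
    """The set {0, 1/d, 2/d, ..., 1} (illustrative helper)."""
    return {Fraction(i, d) for i in range(d + 1)}

print(is_base(generated_base(10)))                          # tenths
print(is_base({Fraction(0), Fraction(1)}))                  # classical case
print(is_base({Fraction(0), Fraction(1, 3), Fraction(1)}))  # 2/3 is missing
```

Exact fractions are used rather than floats so that membership tests such as `x + y in F` are not disturbed by rounding.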

The logic is defined relative to a fixed base set
$F = \{x_0, x_1, \ldots, x_n\} \subseteq [0, 1]$. It is assumed that
$x_i < x_{i+1}$ if $i < n$ (implying $x_0 = 0$ and $x_n = 1$). The basic
operator is $P^{>}_x$, with the intended meaning of $P^{>}_x\varphi$ being:
“$\varphi$ is believed to have a probability strictly greater than $x$”.

The following abbreviations are used (from now on, x and y represent arbitrary
values over [0,1], and $x_i, x_{i+1}$ are elements of the base set F):

D1. $P^{\leq}_x\varphi \equiv \neg P^{>}_x\varphi$
D2. $P^{<}_x\varphi \equiv P^{>}_{1-x}\neg\varphi$
D3. $P^{\geq}_x\varphi \equiv \neg P^{>}_{1-x}\neg\varphi$
D4. $P^{=}_x\varphi \equiv \neg P^{>}_x\varphi \wedge \neg P^{<}_x\varphi$







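The inference rules (R1 and R2) and axioms (A1-A9) of $P_FKD45$ are:

R1 From $\varphi$ and $\varphi \rightarrow \psi$ infer $\psi$ (modus ponens)

R2 From $\varphi$ infer $P^{\geq}_1\varphi$ (necessitation rule)

A1 All propositional tautologies

A2 $P^{\geq}_1(\varphi \rightarrow \psi) \rightarrow [(P^{>}_x\varphi \rightarrow P^{>}_x\psi) \wedge (P^{>}_x\varphi \rightarrow P^{\geq}_x\psi) \wedge (P^{\geq}_x\varphi \rightarrow P^{\geq}_x\psi)]$

A3 $P^{\geq}_1(\varphi \rightarrow \psi) \rightarrow (P^{\geq}_x\varphi \rightarrow P^{>}_y\psi)$ (where $y < x$)

A4 $P^{\geq}_0\varphi$

A5 $P^{>}_{x+y}(\varphi \vee \psi) \rightarrow (P^{>}_x\varphi \vee P^{>}_y\psi)$ (where $x + y \in [0, 1]$)

A6 $P^{\geq}_1\neg(\varphi \wedge \psi) \rightarrow ((P^{>}_x\varphi \wedge P^{\geq}_y\psi) \rightarrow P^{>}_{x+y}(\varphi \vee \psi))$ (where $x + y \in [0, 1]$)

A7 $P^{>}_{x_i}\varphi \rightarrow P^{\geq}_{x_{i+1}}\varphi$

A8 $(P^{>}_0 P^{>}_x\varphi \rightarrow P^{>}_x\varphi) \wedge (P^{>}_0 P^{\geq}_x\varphi \rightarrow P^{\geq}_x\varphi)$

A9 $(P^{>}_x\varphi \rightarrow P^{\geq}_1 P^{>}_x\varphi) \wedge (P^{\geq}_x\varphi \rightarrow P^{\geq}_1 P^{\geq}_x\varphi)$

The axioms A1-A6 all reflect basic properties of probabilities. Axiom A7 reflects
the peculiarity of having a base set F: it says that, if a probability is bigger than
a certain value in F, it must be at least the next value. Axioms A8 and A9
are included to emphasize the relationship with the modal logic KD45 and they
make our agents doxastically introspective. The intuition behind these additional
axioms is as follows. Axiom A8 denotes that, if the agent assigns a positive
probability to some probabilistic judgement, then it incorporates this judgement.
Axiom A9 states that the agent is absolutely sure about its own probabilistic
beliefs (the focus of [9] was not on a doxastic interpretation of the modalities,
and these introspective properties were not included).

Lemma 1. The following theorems are derivable in $P_FKD45$:
$\vdash P^{\leq}_1\varphi$ and $\vdash P^{\geq}_1\varphi \leftrightarrow P^{=}_1\varphi$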



Remark 1. We can define a belief operator, 'B', using
$B\varphi \equiv P^{\geq}_1\varphi$, and can then infer the following.

a) 1. $\vdash \varphi \Rightarrow\ \vdash B\varphi$

2. $\vdash B(\varphi \rightarrow \psi) \rightarrow (B\varphi \rightarrow B\psi)$

3. $\vdash \neg B\bot$

4. $\vdash B\varphi \rightarrow BB\varphi$

5. $\vdash \neg B\varphi \rightarrow B\neg B\varphi$

b) We say that a formula in L is modal if it is built from atomic propositions,
using only the logical connectives and the modal operator B. We claim that
for all modal formulas $\varphi$, $P_FKD45 \vdash \varphi$ iff $KD45 \vdash \varphi$.

Proof. The $\Leftarrow$ part follows from a) above; the $\Rightarrow$ part will be
obvious from the semantics for $P_FKD45$ given later².

² Due to space limitations, full proofs are generally omitted, but can be found in
the associated technical report [1].







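Below we present some further theorems of $P_FKD45$, though we give only one
example proof. We also utilise some additional notation:

- $(\varphi \,\nabla\, \psi)$ means $((\varphi \vee \psi) \wedge \neg(\varphi \wedge \psi))$, i.e. exclusive OR;

- $x\!\uparrow\, = \min\{y \in F \mid y > x\}$ and $x\!\downarrow\, = \max\{y \in F \mid y < x\}$.

Now, for all $\varphi, \psi$ in the language and all $x \in [0, 1]$:

T1. $(P^{\geq}_x\varphi \leftrightarrow (P^{>}_x\varphi \vee P^{=}_x\varphi)) \wedge (P^{\leq}_x\varphi \leftrightarrow (P^{<}_x\varphi \vee P^{=}_x\varphi))$

T2. $P^{>}_x\varphi \,\nabla\, P^{=}_x\varphi \,\nabla\, P^{<}_x\varphi$

T3. $\neg(P^{=}_x\varphi \wedge P^{=}_y\varphi)$ ($y \neq x$)

T4. $(\neg P^{\geq}_x\varphi \leftrightarrow P^{<}_x\varphi) \wedge (\neg P^{>}_x\varphi \leftrightarrow P^{\leq}_x\varphi)$

T5. $P^{=}_x\varphi \leftrightarrow (P^{\geq}_x\varphi \wedge P^{\leq}_x\varphi)$

T6. $P^{>}_x\varphi \rightarrow P^{>}_y\varphi$ ($y < x$)

T7. $P^{=}_x\varphi \leftrightarrow P^{=}_{1-x}\neg\varphi$

T8. $P^{>}_x\varphi \rightarrow P^{=}_{x_i}\varphi \vee P^{=}_{x_{i+1}}\varphi \vee \ldots \vee P^{=}_{x_n}\varphi$, with $x_i = x\!\uparrow$;

T9. $P^{<}_x\varphi \rightarrow P^{=}_{x_0}\varphi \vee P^{=}_{x_1}\varphi \vee \ldots \vee P^{=}_{x_i}\varphi$, with $x_i = x\!\downarrow$;

T10. $(P^{>}_x\varphi \leftrightarrow P^{\geq}_{x\uparrow}\varphi) \wedge (P^{<}_y\varphi \leftrightarrow P^{\leq}_{y\downarrow}\varphi)$, $x \in [0, 1)$, $y \in (0, 1]$

T11. $[P^{\geq}_1\neg(\varphi \wedge \psi) \wedge P^{=}_x\varphi] \rightarrow [P^{=}_y\psi \leftrightarrow P^{=}_{x+y}(\varphi \vee \psi)]$, $x, y, x + y \in [0, 1]$

T12. $P^{>}_0 P^{\sim}_x\varphi \rightarrow P^{\sim}_x\varphi$, where $\sim$ is one of $\{<, \leq, =, \geq, >\}$

T13. $P^{\sim}_x\varphi \rightarrow P^{\geq}_1 P^{\sim}_x\varphi$, where $\sim$ is one of $\{<, \leq, =, \geq, >\}$

The following lemma shows the benefit of having a finite base F: it guarantees
that we can express in the language that every formula has a probability.

Lemma 2. For all $\varphi \in L$, the following is a $P_FKD45$-theorem:

$P^{=}_{x_0}\varphi \vee P^{=}_{x_1}\varphi \vee \ldots \vee P^{=}_{x_n}\varphi$ (recall: $F = \{0 = x_0, x_1, \ldots, x_n = 1\}$)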



3 Semantics and Properties 

Formulas $\varphi \in L$ are interpreted on Probabilistic Kripke Models over F.

Definition 2. For each base set F, $\mathcal{P}_F\mathcal{KD}45$ is the class of all
models $M = (W, P_F, \pi)$ for which:

- W is a non-empty set (of worlds);

- $P_F$ is a function $P_F : W \rightarrow F$, satisfying $\sum_{w \in W} P_F(w) = 1$;

- $\pi$ is a valuation: $W \times L \rightarrow \{true, false\}$;

- the truth definition for formulas is defined in a standard way, the modal
clause reading:

$(M, w) \models P^{>}_x\varphi$ iff $\sum_{w' \text{ s.t. } (M, w') \models \varphi} P_F(w') > x$
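
The modal clause can be read as a small evaluator. The sketch below makes several assumptions that are not in the paper: worlds are given as pairs of a probability and the set of atoms true there, formulas are restricted to propositional predicates over those atoms, and the names `prob`, `P_gt`, `P_ge`, `P_lt`, `P_le` and `P_eq` are hypothetical. The derived operators are obtained from the primitive "strictly greater" operator by the usual dualities.

```python
from fractions import Fraction

# Hypothetical toy model: each world is (probability, set of true atoms).
MODEL = [
    (Fraction(1, 2), {"p", "q"}),
    (Fraction(3, 10), {"p"}),
    (Fraction(1, 5), set()),
]
assert sum(pr for pr, _ in MODEL) == 1  # the probabilities must sum to 1

def prob(model, phi):
    """Summed probability of the worlds whose atoms satisfy the predicate phi."""
    return sum((pr for pr, atoms in model if phi(atoms)), Fraction(0))

def neg(phi):
    return lambda atoms: not phi(atoms)

def P_gt(model, x, phi):   # primitive: probability strictly greater than x
    return prob(model, phi) > x

def P_le(model, x, phi):   # "at most x": not strictly greater than x
    return not P_gt(model, x, phi)

def P_lt(model, x, phi):   # "less than x": the negation exceeds 1 - x
    return P_gt(model, 1 - x, neg(phi))

def P_ge(model, x, phi):   # "at least x": the negation does not exceed 1 - x
    return not P_gt(model, 1 - x, neg(phi))

def P_eq(model, x, phi):   # "exactly x": neither greater nor smaller
    return not P_gt(model, x, phi) and not P_lt(model, x, phi)

p = lambda atoms: "p" in atoms
print(prob(MODEL, p))                    # 4/5
print(P_gt(MODEL, Fraction(1, 2), p))    # True
print(P_eq(MODEL, Fraction(4, 5), p))    # True
print(P_ge(MODEL, Fraction(1), p))       # False
```

Because the distribution does not depend on the world of evaluation, the evaluator needs no "current world" argument for these non-nested formulas.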







Note that the probability distribution is independent of the world. Let us call
such a structure $\mathcal{P}_F\mathcal{KD}45$.

One can relate this semantics to a more standard Kripke semantics as follows.
Given $M = (W, P_F, \pi)$, first choose an arbitrary world $w$ in the model M. Then,
let $W'$ be $\{w\} \cup \{w' \mid P_F(w') > 0\}$. Finally, define $R'(x, y)$ iff
$P_F(y) > 0$, i.e., a world is accessible (from any world) if, and only if, its
probability is positive. Let $M'_w = (W', R', \pi')$ be the model thus obtained,
with $\pi'$ being the restriction of $\pi$ to $W'$. The following gives a semantic
motivation for coining our system $P_FKD45$:

Proposition 1. Given a $P_FKD45$ model $M = (W, P_F, \pi)$ and a world $w$, let
$M'_w = (W', R', \pi')$ be obtained as described above. Moreover, let a purely modal
formula from $P_FKD45$ be a formula in which all modal operators are $P^{\geq}_1$,
or, equivalently, $B$. Then:

1. for every purely modal formula $\varphi$, we have $M, w \models \varphi$ iff $M'_w, w \models \varphi$;

2. the accessibility relation $R'$ is serial, transitive and Euclidean.
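
The second item of the proposition can be checked on a concrete distribution. The following is a sketch only, assuming a distribution is given as a dictionary from hypothetical world names to exact fractions; it builds the worlds and accessibility relation as described above (the chosen world plus all worlds of positive probability, with a world accessible exactly when its probability is positive) and tests the three frame properties directly.

```python
from fractions import Fraction

def kripke_frame(P, w):
    """Worlds and accessibility relation derived from a distribution P and world w."""
    worlds = {w} | {v for v, pr in P.items() if pr > 0}
    relation = {(x, y) for x in worlds for y in worlds if P[y] > 0}
    return worlds, relation

def is_serial(worlds, rel):
    return all(any((x, y) in rel for y in worlds) for x in worlds)

def is_transitive(rel):
    return all((x, z) in rel for (x, y) in rel for (y2, z) in rel if y == y2)

def is_euclidean(rel):
    return all((y, z) in rel for (x, y) in rel for (x2, z) in rel if x == x2)

# Illustrative distribution: the chosen world w0 itself has probability 0.
P = {"w0": Fraction(0), "w1": Fraction(1, 2), "w2": Fraction(1, 2)}
worlds, rel = kripke_frame(P, "w0")
print(is_serial(worlds, rel), is_transitive(rel), is_euclidean(rel))
print(("w0", "w0") in rel)  # the relation need not be reflexive
```

The last line shows why the resulting frame is of KD45 rather than S5 type: a world of probability zero does not access itself.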



Lemma 3. $P_FKD45$ is sound with respect to $\mathcal{P}_F\mathcal{KD}45$, i.e.,
$P_FKD45 \vdash \varphi \Rightarrow \mathcal{P}_F\mathcal{KD}45 \models \varphi$.



3.1 Completeness 

Let $\varphi$ be a consistent formula of $P_FKD45$. We will show how to construct a
model that satisfies $\varphi$. Let $\Phi$ be the set of sub-formulas of $\varphi$,
closed under a single negation and satisfying, for any $\sim$ within
$\{<, >, \leq, \geq, =\}$,
$(P^{\sim}_x\psi \in \Phi \Rightarrow \{P^{\sim}_{x_i}\psi \mid x_i \in F\} \subseteq \Phi)$.
With $\Phi$ being finite, say $|\Phi| = k$, we can define the $\Phi$-maximal
consistent sets as $\Gamma_1, \Gamma_2, \ldots, \Gamma_n$, $n \leq 2^k$. Let
$\gamma_i$ be the conjunction of formulas in $\Gamma_i$, $i \leq n$. Then, we have:

i. $\vdash \neg(\gamma_i \wedge \gamma_j)$, where $i \neq j$;
ii. $\vdash (\gamma_1 \vee \ldots \vee \gamma_n)$;
iii. $\vdash \psi \leftrightarrow \gamma_{\psi,1} \vee \ldots \vee \gamma_{\psi,m}$,
where $\gamma_{\psi,1}, \ldots, \gamma_{\psi,m}$ are exactly those $\gamma$'s which
contain $\psi$ as a conjunct, for each $\psi \in \Phi$.

Since $\varphi$ is consistent, by construction of the $\Gamma$'s there is at least
one $\Gamma_\varphi$ such that $\varphi \in \Gamma_\varphi$. Given this
$\Gamma_\varphi$, we construct a set $\Delta \supseteq \Gamma_\varphi$ as follows.
From Theorem T8, we know that for every consistent set $\Gamma$ and formula
$\psi$, at least one set of the sequence

$\Gamma \cup \{P^{=}_{x_0}\psi\},\ \Gamma \cup \{P^{=}_{x_1}\psi\},\ \ldots,\ \Gamma \cup \{P^{=}_{x_{n-1}}\psi\},\ \Gamma \cup \{P^{=}_{x_n}\psi\}$ (2)

is also consistent. Now, we obtain $\Delta$ from $\Gamma_\varphi$ as follows:

1. let $\Delta_0 = \Gamma_\varphi$ (this set is consistent);

2. for $i = 1$ to $n$, we know that there is some $x \in F$ such that
$\Delta_{i-1} \cup \{P^{=}_x\gamma_i\}$ will be consistent, and we make the
corresponding choice for $\Delta_i$.







We let $\Delta$ be $\Delta_n$; this is a consistent extension of $\Gamma_\varphi$,
which contains a probability in F for every 'world' $\Gamma_i$ ($i \leq n$). We are
now ready to define our canonical model $M^c = (W^c, P^c_F, \pi^c)$ as follows:

1. $W^c = \{\Gamma_\varphi\} \cup \{\Gamma_i \mid P^{=}_x\gamma_i \in \Delta \text{ for some } x > 0\}$

2. $P^c_F(\Gamma_i) = x \Leftrightarrow P^{=}_x\gamma_i \in \Delta$

3. $\pi^c(\Gamma_i)(p) = true$ iff $p \in \Gamma_i$

Lemma 4 (Coincidence Lemma). For all $\psi \in \Phi$ and $\Gamma \in W^c$:

$M^c, \Gamma \models \psi$ iff $\psi \in \Gamma$

Theorem 1 (Soundness and Completeness, Finite Models). For any formula
$\varphi$, we have $\mathcal{P}_F\mathcal{KD}45 \models \varphi$ iff
$P_FKD45 \vdash \varphi$. Moreover, every consistent formula has a finite model.

3.2 Nested Beliefs 

Considering $P_FKD45$ as a language for representing properties within individual
agents, we next show that nested belief formulas can be removed, i.e., any nested
belief formula is equivalent to some formula given without nesting.

Lemma 5 (Independence of Probability Distribution). Let
$M = (W, P_F, \pi)$ be a $\mathcal{P}_F\mathcal{KD}45$ model. Then:

$\exists w \in W\ (M, w) \models P^{>}_x\beta \Leftrightarrow \forall u \in W,\ (M, u) \models P^{>}_x\beta$.

We now demonstrate that nested beliefs are superfluous in $P_FKD45$. This result
is a generalisation of [10, Theorem 1.7.6.4], where it is proved for S5, which means
that their result still goes through when weakening the logic to KD45, and even
when having probabilistic operators.

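Definition 3. We say that a formula $\psi$ is in normal form if it is a disjunction
of conjunctions of the form
$\delta = \omega \wedge P^{>}_{x_1}\beta_1 \wedge P^{>}_{x_2}\beta_2 \wedge \ldots \wedge P^{>}_{x_n}\beta_n \wedge P^{\geq}_{y_1}\alpha_1 \wedge P^{\geq}_{y_2}\alpha_2 \wedge \ldots \wedge P^{\geq}_{y_k}\alpha_k$,
where $\omega, \beta_i, \alpha_j$ ($i \leq n$, $j \leq k$) are all purely
propositional formulas. The formula $\delta$ is called the canonical conjunction
and the sub-formulas $P^{>}_{x_i}\beta_i$ and $P^{\geq}_{y_j}\alpha_j$ are called
prenex formulas.

Lemma 6. If $\psi$ is in normal form and contains a prenex formula $\sigma$, then
$\psi$ may be supposed to have the form $\pi \vee (\lambda \wedge \sigma)$ where
$\pi$, $\lambda$ and $\sigma$ are in normal form.

Proof. $\psi$ is in normal form, so
$\psi = \delta_1 \vee \delta_2 \vee \ldots \vee \delta_m$, where the $\delta_i$'s are
canonical conjunctions. Suppose $\sigma$ occurs in $\delta_m$. Then $\sigma$ must be
some conjunct $P^{\sim}_x\beta$ of $\delta_m$, so that $\delta_m$ can be written as
$(\lambda \wedge \sigma)$. Taking $\pi$ to be
$(\delta_1 \vee \delta_2 \vee \ldots \vee \delta_{m-1})$ gives the desired result
$\psi = \pi \vee (\lambda \wedge \sigma)$.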







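This lemma guarantees that prenex formulas can always be moved to the
outermost level.

Lemma 7 (Removal of Nested Beliefs). We have, in $P_FKD45$:

$P^{>}_x(\pi \vee (\lambda \wedge P^{>}_y\beta)) \leftrightarrow (P^{>}_x(\pi \vee \lambda) \wedge P^{>}_y\beta) \vee (P^{>}_x\pi \wedge \neg P^{>}_y\beta)$ (3)

$P^{>}_x(\pi \vee (\lambda \wedge P^{\geq}_y\beta)) \leftrightarrow (P^{>}_x(\pi \vee \lambda) \wedge P^{\geq}_y\beta) \vee (P^{>}_x\pi \wedge \neg P^{\geq}_y\beta)$ (4)

Proof. We sketch the proof of (3). As
$(M, s) \models P^{>}_y\beta \vee P^{\leq}_y\beta$, there are two possible cases to
consider.

First Case. Assuming $(M, s) \models P^{>}_y\beta$ we aim to show that

$P^{>}_x(\pi \vee (\lambda \wedge P^{>}_y\beta)) \leftrightarrow (P^{>}_x(\pi \vee \lambda) \wedge P^{>}_y\beta)$ (5)

For $\rightarrow$, note that
$(\pi \vee (\lambda \wedge P^{>}_y\beta)) \rightarrow (\pi \vee \lambda)$ is a
tautology. Hence, the truth of $P^{>}_x(\pi \vee (\lambda \wedge P^{>}_y\beta))$ in
$s$ implies that of $P^{>}_x(\pi \vee \lambda)$ in $s$ (using A2). This, together
with $(M, s) \models P^{>}_y\beta$, leads to

$(M, s) \models P^{>}_x(\pi \vee (\lambda \wedge P^{>}_y\beta)) \rightarrow (P^{>}_x(\pi \vee \lambda) \wedge P^{>}_y\beta)$ (6)

and this is valid for any state since
$(M, s) \models P^{>}_y\beta \Leftrightarrow \forall u \in S,\ (M, u) \models P^{>}_y\beta$.

Concerning the converse, from $P^{>}_x(\pi \vee \lambda) \wedge P^{>}_y\beta$ we
have that both $P^{>}_x(\pi \vee \lambda)$ and $P^{>}_y\beta$ are true in all
$u \in S$. Hence, for every $u$,
$(M, u) \models \lambda \Leftrightarrow (M, u) \models \lambda \wedge P^{>}_y\beta$.
So,

$(M, s) \models (P^{>}_x(\pi \vee \lambda) \wedge P^{>}_y\beta) \rightarrow P^{>}_x(\pi \vee (\lambda \wedge P^{>}_y\beta))$ (7)

Then, at this point we have:

$(M, s) \models P^{>}_y\beta \rightarrow (P^{>}_x(\pi \vee (\lambda \wedge P^{>}_y\beta)) \leftrightarrow (P^{>}_x(\pi \vee \lambda) \wedge P^{>}_y\beta))$ (8)

The second case is analogous, giving

$(M, s) \models \neg P^{>}_y\beta \rightarrow (P^{>}_x(\pi \vee (\lambda \wedge P^{>}_y\beta)) \leftrightarrow (P^{>}_x\pi \wedge \neg P^{>}_y\beta))$ (9)

After considering the two cases we can, finally, use the propositional tautology

$[(p \rightarrow (q \leftrightarrow (r \wedge p))) \wedge (\neg p \rightarrow (q \leftrightarrow (s \wedge \neg p)))] \rightarrow [q \leftrightarrow ((r \wedge p) \vee (s \wedge \neg p))]$,

together with (8) and (9) to conclude (3).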

In this way we can bring all the probabilistic operators to the outermost level:³

Theorem 2. Every formula $\varphi$ is equivalent to a formula $\psi$ in normal form,
i.e., a formula without nesting of probabilistic operators.

³ This result seems parallel to a result [7, Theorem 3.1.30] about a language with
quantifiers, whose proof is given by induction on $\varphi$.







4 Decision Procedure 

As previously explained, the semantic definition for $P_FKD45$-formulas is based
on Probabilistic Kripke Models. For each world w there is a set of worlds that
w considers possible and each one of these possible worlds is specified according
to the formulas it satisfies. For instance, if $P^{\geq}_1 p$ holds in the actual
world w, the probability values assigned to the possible worlds where p is true
sum up to 1 (which, in this case, guarantees that all the worlds where p is false
have probability zero).

In other words, by definition, the probability of a formula is given by the sum 
of the probability values assigned to the worlds that satisfy this formula, and 
satisfiability of a propositional formula is given by the assignment of truth-values
to its symbols. So, by evaluating formulas, we identify the worlds where those 
formulas are satisfied. As a result, we can obtain the values that, once assigned 
to the set of possible worlds, can satisfy the modal formula present in the agent’s 
specification. 

The idea is to convert the set of formulas into constraint (in)equations. 
The inequation components represent all the possible truth valuations for the 
propositional symbols. A finite set of formulas is given, and a finite set of con- 
straint (in)equations will be generated; each formula is converted into a set of 
(in)equational statements. 

For instance, consider that the agent specification is expressed by the set of
formulas $\{P^{>}_{0.8}\,p,\ P^{>}_{0.7}\,q\}$ and F = {0, 0.1, 0.2, ..., 0.9, 1}.
The four possible sets of worlds (characterised by the truth-assignments) are:
p1q1 (where both p and q hold), p1q0 (in which p holds and the negation of q
holds), p0q1 (in which q and the negation of p hold) and p0q0 (where both
negations hold).

In the given example, the set of constraints generated is:

p0q0 + p0q1 + p1q0 + p1q1 = 1
p1q0 + p1q1 > 0.8
p0q1 + p1q1 > 0.7

The first equation expresses the fact that probability values have to sum up to
1. The two inequations represent constraints on the worlds in which p holds and
worlds in which q holds, respectively.
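
The constraint system above is small enough to solve by exhaustive search. The following sketch is not the implemented decision procedure described in the paper; it is a brute-force illustration that assumes the two belief constraints are strict (greater than 0.8 and greater than 0.7, as displayed) and works in integer tenths to avoid floating-point comparisons. The names `WORLDS`, `mass` and `solutions` are illustrative.

```python
from itertools import product

ATOMS = ("p", "q")
D = 10  # F = {0, 1/10, ..., 1}; probabilities are handled as integer tenths

# Worlds are truth assignments, in the order p0q0, p0q1, p1q0, p1q1.
WORLDS = list(product((False, True), repeat=len(ATOMS)))

def mass(weights, atom):
    """Total weight (in tenths) of the worlds where `atom` is true."""
    i = ATOMS.index(atom)
    return sum(w for w, world in zip(weights, WORLDS) if world[i])

def solutions():
    """Yield every assignment of tenths to the four worlds meeting the constraints."""
    for weights in product(range(D + 1), repeat=len(WORLDS)):
        if sum(weights) != D:            # probabilities sum to 1
            continue
        if mass(weights, "p") > 8 and mass(weights, "q") > 7:
            yield weights

for s in solutions():
    print([w / D for w in s])
```

The search space grows as (d+1) to the power of the number of worlds, and the number of worlds is itself exponential in the number of atoms, so this enumeration is only practical for very small specifications.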

Solving the constraint (in)equations determines which are the values in set
F that obey the constraints imposed by the formulas and can be, consequently,
applied to the set of worlds. Therefore, the decision procedure turns out to be
a mechanism for finding all the possible probability assignments for the set of
possible worlds that would satisfy the specified formulas, as long as this set of
formulas is consistent. Otherwise, no possible assignment exists.

As mentioned above, the decision procedure converts the set of formulas into
a set of constraint (in)equations. Identifying the propositional symbols is
essential for determining the inequation components, and the number of components
grows exponentially in the number of propositional symbols. Each formula
determines which components constitute each inequation. Finally, the inequations
are produced and solved.







Theorem 3 (Decision Procedure). A formula $\varphi$ in $P_FKD45$ is satisfiable
if, and only if, there is a solution for the set of (in)equations generated from
$\varphi$ within the domain F.



5 Example 



We present a simple example to show what an agent specification might look like
in the $P_FKD45$ language. This is a variant of the common “travel agent”
scenario whereby, once the travel agent believes you might be interested in a
holiday, (s)he sends you information. The basic formulas are given as follows (the
finiteness of the domain ensures that this example can indeed be represented in
a propositional language).

A. $ask(you, x) \rightarrow P^{\geq}_{0.8}\,go(you, x)$, i.e., “if you ask for
information about the destination x, then I believe that you wish to go to x with
probability greater than, or equal to, 0.8”

B. $P^{\geq}_1[go(you, x) \rightarrow buy(you, holiday, x)]$, i.e., “I believe that,
if you wish to go to x, then you will buy a holiday in x”

C. $P^{>}_{0.5}\,buy(you, holiday, x) \rightarrow sendinfo(you, x)$, i.e., “if I
believe that you will buy a holiday for x with probability greater than 0.5, I send
information about holidays at x”

D. $ask(you, x)$, i.e., “you ask for information on destination x”

From D and A and R1 we have: $P^{\geq}_{0.8}\,go(you, x)$ (Res1).

From Res1, A3 and item B: $P^{>}_{0.7}\,buy(you, holiday, x)$ (Res2).

From Res2 and T6: $P^{>}_{0.5}\,buy(you, holiday, x)$ (Res3).

From Res3 and item C: $sendinfo(you, x)$

Referring to the decision procedure execution, there are three formulas to be
evaluated (the ones that express degrees of belief):

1. $P^{\geq}_{0.8}\,go(you, x)$ (from A)

2. $P^{\geq}_1[go(you, x) \rightarrow buy(you, holiday, x)]$ (from B)

3. $P^{>}_{0.5}\,buy(you, holiday, x)$ (from C)



We obtain 6 solutions when solving the first two rules. All of these solutions
satisfy the antecedent of the third rule (as would be expected from the formal
proof given above). This means that, whichever solution is chosen as a possible
value assignment, the antecedent of rule C is true. In other words, independently
of the assignment, sendinfo(you, x) is a logical consequence of the knowledge
theory, and six assignments can be considered as options when building a model
for the agent specification.

In this case, the six assignments for [B0G0, B0G1, B1G0, B1G1] are:
[0, 0, 0, 1], [0, 0, .1, .9], [0, 0, .2, .8], [.1, 0, 0, .9], [.1, 0, .1, .8] and [.2, 0, 0, .8]

(where “B” represents buy(...) and “G” go(...)).
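
The six assignments can be reproduced by brute force. This is only an illustrative sketch, not the authors' implementation: it assumes F consists of the tenths, encodes the first formula as "the go-worlds carry at least 0.8" and the second as "the worlds satisfying go implies buy carry probability 1", and uses integer tenths throughout. The function name `travel_solutions` is hypothetical.

```python
from itertools import product

D = 10  # probabilities are expressed in tenths, i.e. F = {0, 0.1, ..., 1}

def travel_solutions():
    """Assignments (in tenths) to the worlds [B0G0, B0G1, B1G0, B1G1]."""
    for b0g0, b0g1, b1g0, b1g1 in product(range(D + 1), repeat=4):
        if b0g0 + b0g1 + b1g0 + b1g1 != D:    # probabilities sum to 1
            continue
        if b0g1 + b1g1 < 8:                   # go holds with probability >= 0.8
            continue
        if b0g0 + b1g0 + b1g1 < D:            # (go -> buy) holds with probability 1
            continue
        yield (b0g0, b0g1, b1g0, b1g1)

sols = list(travel_solutions())
for s in sols:
    print([w / D for w in s])

# Antecedent of rule C: buy holds with probability strictly greater than 0.5.
print(all(b1g0 + b1g1 > 5 for _, _, b1g0, b1g1 in sols))
```

Every solution gives the buy-worlds a total of at least 0.8, which is why the antecedent of rule C holds regardless of which assignment is chosen.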







5.1 Limiting F 

In this section, we elaborate on ways to automatically generate an appropriate
base set for a formula. In particular, we will look at sets F that are generated
by some number $\frac{1}{d}$. For such an F, we will write $F = F_{1/d}$. In general
we have that satisfiability is preserved when considering bigger sets F: if
$F \subseteq F'$, then $\mathcal{P}_F\mathcal{KD}45$-satisfiability implies
$\mathcal{P}_{F'}\mathcal{KD}45$-satisfiability. As a consequence, we have that a
formula $\varphi$ is $P_FKD45$-satisfiable for some F if, and only if, it is
$P_{F'}KD45$-satisfiable for some generated F'. So, given a formula $\varphi$, can
we generate an F which is sufficient for satisfiability of $\varphi$? If we succeed
in this, the user of the specification language $P_FKD45$ need not bother about a
specific F, but instead can leave the system to generate it.

To get a feeling for how sensitive the matter of satisfiability is to the
choice of F, suppose we have three atoms p, q and r, and let L(p, q, r) be the
set of conjunctions of literals over them:
$L = \{(\neg)p \wedge (\neg)q \wedge (\neg)r\}$, and let our constraint $\varphi$ be:

$\bigwedge_{\psi \in L(p,q,r)} P^{>}_0\psi$ (10)

If F is generated by $\frac{1}{4}$, there is no model for (10), since every
combination of atoms $\psi$ would have a probability of at least $\frac{1}{4}$,
giving the disjunction $\bigvee_{\psi \in L(p,q,r)}\psi$ a 'probability' of at
least 2, which is, of course, not possible. One easily verifies that (10) is
satisfiable for a set F generated by $\frac{1}{d}$ iff $d \geq 8$, giving enough
'space' for each of the $\psi$'s.
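
The threshold of eight can be confirmed by a small search. The sketch below rests on two assumptions: constraint (10) is read as requiring every one of the eight conjunctions of literals to have strictly positive probability, and F is taken to be the set of multiples of 1/d. The function name `satisfiable` is hypothetical.

```python
def satisfiable(d, n_worlds=8):
    """Can n_worlds worlds each receive a positive multiple of 1/d, summing to 1?

    Weights are counted in units of 1/d, so the total to distribute is d.
    """
    def search(remaining, worlds_left):
        if worlds_left == 0:
            return remaining == 0
        # each world needs at least one unit; try every admissible weight
        return any(search(remaining - w, worlds_left - 1)
                   for w in range(1, remaining + 1))
    return search(d, n_worlds)

print([d for d in range(1, 11) if satisfiable(d)])
```

For three atoms there are eight worlds, so at least eight units of 1/d are needed; any larger d leaves spare units that can be given to an arbitrary world.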

A range F with few elements easily gives rise to unsatisfiability. Axiom A7
forces one to make 'big jumps' between constraints: if we have $P^{>}_{r_i}\varphi$
for a certain $r_i \in F$, we are forced to assign $\varphi$ at least the next
probability in F, viz., $r_{i+1}$.

We now sketch a way to construct an F from the formula $\varphi$ in a most cautious
way. Consider the formula $\varphi$. Rewrite all the occurrences of
$P^{\sim}_x\psi$ in $\varphi$ in such a way that they all have a common denominator
d: every $P^{\sim}_x\psi$ gets rewritten as a $P^{\sim}_{m/d}\psi$. Let
$x_1, \ldots, x_m$ be all the boolean combinations of atoms from $\varphi$
($m = 2^k$). The formula $\varphi$ gives rise to a number $v$ of inequalities
$I(d)$:

$$I(d):\quad \begin{array}{l} \kappa_{11}x_1 + \kappa_{12}x_2 + \ldots + \kappa_{1m}x_m \sim_1 \frac{n_1}{d} \\ \quad\vdots \\ \kappa_{v1}x_1 + \kappa_{v2}x_2 + \ldots + \kappa_{vm}x_m \sim_v \frac{n_v}{d} \end{array}$$

Since solutions of $I(d)$ are obtained by taking linear combinations of the
inequalities, it is clear that they are (as linear combinations of the right hand
sides) of the form $\frac{n}{d}$, for some n. Now, take the first $x_i$ that is not
yet determined; say the tightest constraint on $x_i$ says that it is between
$\frac{n}{d}$ and $\frac{t}{d}$ for certain n and t. Then we can safely add the
constraint $x_i = \frac{n+t}{2d}$ to $I(d)$ and obtain a set of inequalities
$I(2d)$. Doing this iteratively gives us the following:



Conjecture 1. Let $\varphi$ be a formula in our language, with denominator d. Then,
$\varphi$ is satisfiable for some F iff $\varphi$ is satisfiable for F,„ = , , where
k is the number of atoms occurring in $\varphi$.







6 Related Work and Conclusion 

Several methods have been developed to deal with uncertain information, often 
being split between numerical (or quantitative) or symbolic (or qualitative) ones 
[12]. $P_FKD45$ is a system that combines logic and probability. In this sense, it
is related to other work that showed how this combination would be possible 
in different ways [6]. One of those possible approaches is the interpretation of 
the modal belief operator according to the concept of ‘likelihood’ (as in [8]). 
In this logic, instead of using numbers to express uncertainty one would have 
expressions like “p is likely to be a consistent hypothesis” (as a state is taken as 
a set of hypotheses “true for now”). That is, a qualitative notion of likelihood 
rather than explicit probabilities. 

$P_FKD45$ was designed for reasoning with (exact) probabilities. Its
Probabilistic Kripke Model semantics is similar to the one presented in [5,4]. In
their formalism, a formula is typically a boolean combination of expressions of the
form $a_1 w(\varphi_1) + \ldots + a_k w(\varphi_k) \geq c$, where
$a_1, \ldots, a_k, c$ are integers, and each $\varphi_i$ is propositional. The
restriction of having the $\varphi_i$'s purely propositional does not apply to
$P_FKD45$. Besides, the system in [5,4] includes, as axioms, all the formulas of
linear inequalities; consequently, their proofs of completeness rely on results in
the area of linear programming. Our logic is conceptually simpler. Finally,
$P_FKD45$ differs from other systems for representing beliefs and probability
mainly by allowing only a finite range of probability values, an assumption that
at the same time imposes restrictions on the values that can be assigned to the
possible worlds and permits the restoration of compactness for the logic.

Perhaps the work closest to ours is that of [11]. It considers languages for 
first-order probabilities, and the compactness of $P_F KD45$ easily follows from [11, 
Theorem 11]. The authors also consider the case in which all the worlds are assigned 
the same probability function, but for a language that forbids iteration. 

In this paper, we presented $P_F KD45$, a simple and compact logic combining 
modal logic with probability. Despite the inclusion of new axioms and slight 
changes in the semantics, we showed that the logic preserves important results 
about soundness, completeness, the finite model property and decidability of the previous 
system $P_F D$ [9]. In addition, new results about nested beliefs have been presented, 
a decision procedure for the logic has been developed, and brief examples were 
given showing how the language can serve as an appropriate basis for 
the informational attitudes of an agent specification language. In summary, we 
proposed not only a complete axiomatization for the logic, but also a decision 
procedure that permits us to verify satisfiability of $P_F KD45$-formulas. 

The use of a finite range F of probability values is a peculiar, and important, 
property of our logic. Although the use of a base F causes a logical restriction, it 
seems possible to choose its granularity according to the intended agent 
application. Besides, as discussed earlier in Section 1, the compactness that it brings has 
significant benefits. Furthermore, a finite range of probability values reduces the 
computational effort required when building a model for the agent description. 







N. de C. Ferreira, M. Fisher, and W. van der Hoek 



Finally, this work on $P_F KD45$ represents one step towards our main goal: an 
agent programming language capable of specifying and implementing agents that 
deal with uncertain information, together with new mechanisms for handling 
such uncertainty in executable specifications. Future work will concentrate on 
developing an executable framework combining the probabilistic approach of 
$P_F KD45$ with the dynamic approach of and Temporal Logics. This will allow 
us to capture, in our simple and compact approach, the key aspects of uncertain 
agents working in an uncertain world. 



Acknowledgements. The first author gratefully acknowledges support by the 
Brazilian Government under a CAPES scholarship. The authors thank the 
anonymous referees for their relevant comments. 



References 

1. N. de Carvalho Ferreira, M. Fisher and W. van der Hoek, A Simple Logic for 
Reasoning about Uncertainty, Technical Report, Department of Computer Science, 
University of Liverpool, (2004). Online version: 
http://www.csc.liv.ac.uk/~niveacf/techreport/ 

2. D. Driankov, ‘Reasoning about Uncertainty: Towards a many-valued logic of belief’, 
IDA annual research report, 113-120, Linkoping University, (1987). 

3. M. Fattarosi-Barnaba and G. Amati, ‘Modal operators with probabilistic 
interpretations, I’, Studia Logica, 48, 383-393, (1989). 

4. R. Fagin and J.Y. Halpern, ‘Reasoning about knowledge and probability’, Journal 
of the ACM, 41(2), 340-367, (1994). 

5. R. Fagin, J.Y. Halpern, and N. Megiddo, ‘A logic for reasoning about probabilities’, 
Information and Computation, 87(1), 277-291, (1990). 

6. T. Fernando, ‘In conjunction with qualitative probability’, Annals of Pure and 
Applied Logic, 92(3), 217-234, (1998). 

7. P. Hajek and T. Havranek, Mechanizing Hypothesis Formation, Springer, 1978. 

8. J.Y. Halpern and M.O. Rabin, ‘A logic to reason about likelihood’, Artificial In- 
telligence, 32(3), 379-405, (1987). 

9. W. van der Hoek, ‘Some considerations on the logic PFD’, Journal of Applied Non 
Classical Logics, 7(3), 287-307, (1997). 

10. J.-J.Ch. Meyer and W. van der Hoek, Epistemic Logic for AI and Computer Sci- 
ence, Cambridge University Press, 1995. 

11. Z. Ognjanovic and M. Raskovic, ‘Some first-order probability logics’, Theoretical 
Computer Science, 191-212, (2000). 

12. S. Parsons and A. Hunter, ‘A review of uncertainty handling formalisms’, in Ap- 
plications of Uncertainty Formalisms, A. Hunter and S. Parsons (eds), Springer 
(1998). 




Modelling Communicating Agents in Timed Reasoning Logics 



Natasha Alechina, Brian Logan, and Mark Whitsey 

School of Computer Science and IT, University of Nottingham, UK. 
{nza,mtw,bsl}@cs.nott.ac.uk 



Abstract. Practical reasoners are resource-bounded — in particular they require 
time to derive consequences of their knowledge. Building on the Timed Reasoning 
Logics (TRL) framework introduced in [1], we show how to represent the time 
required by an agent to reach a given conclusion. TRL allows us to model the kinds 
of rule application and conflict resolution strategies commonly found in rule-based 
agents, and we show how the choice of strategy can influence the information an 
agent can take into account when making decisions at a particular point in time. 
We prove general completeness and decidability results for TRL, and analyse the 
impact of communication in an example system consisting of two agents which 
use different conflict resolution strategies. 



1 Introduction 

Most research in logics for belief, knowledge and action (see, for example, [12,6,10, 
11,16,17,5,13,20,18]) makes the strong assumption that whatever reasoning abilities an 
agent may have, the results of applying those abilities to a given problem are available 
immediately. For example, if an agent is capable of reasoning from its observations 
and some restricted set of logical rules, it derives all the consequences of its rules 
instantaneously. 

While this is a reasonable assumption in some situations, there are many cases 
where the time taken to do deliberation is of critical importance. Practical agents take 
time to derive the consequences of their beliefs, and, in a dynamic environment, the 
time required by an agent to derive the consequences of its observations will determine 
whether such derivations can play an effective role in action selection. Another example 
involves more standard analytical reasoning and a classical domain for the application 
of epistemic logics: verifying cryptographic protocols. An agent intercepting a coded 
message usually has all the necessary “inference rules” to break the code. The only 
problem is that if the encoding is decent, it would take the intercepting agent millennia 
to actually derive the answer. On the other hand, if the encryption scheme is badly 
designed or the key length is short, the answer can be derived in an undesirably short 
period of time. The kinds of logical result we want to be able to prove are therefore of 
the form: agent i is capable of reaching conclusion $\phi$ within time bound t. 

In this paper we show how to model the execution of communicating rule-based 
agents using Timed Reasoning Logics (TRL). TRL is a context-logic style formalism 
for describing rule-based resource bounded reasoners who take time to derive the conse- 
quences of their knowledge. This paper builds on the work in [1], where we introduced 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 95-107, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004 







TRL. In that paper, we described how our logic can model different rule application 
and conflict resolution strategies, and proved soundness and completeness of the logic 
TRL(STEP) which captures the all rules at each cycle rule application strategy used by 
step logic [3] (for another example of a TRL(STEP) logic, see [21]). We also showed 
how to model a single rule at each cycle strategy similar to that employed by the CLIPS 
[19] rule-based system architecture, and sketched a logic TRL(CLIPS). In this paper, we 
prove a general soundness and completeness result for TRL, from which soundness and 
completeness of TRL(CLIPS) follows. We study TRL(CLIPS) in more detail and give 
a detailed example involving two communicating agents using different CLIPS conflict 
resolution strategies. 

2 Model of an Agent 

In this section we outline a simple model of the kind of rule-based agent whose execution 
cycle we wish to formalise. 

A rule-based agent consists of a working memory and one or more sets of condition- 
action rules. The working memory constitutes the agent’s state, and the rules form the 
agent’s program. We assume that agents repeatedly execute a fixed sense-think-act cycle. 
At each tick of the clock, an agent senses its environment and information obtained by 
sensing is added to the previously derived facts and any a priori knowledge in the 
agent’s working memory. The agent then evaluates the condition-action rules forming 
its program. The conditions of each rule are matched against the contents of the agent’s 
working memory and a subset of the rules are fired. This typically adds or deletes 
one or more facts from working memory and/or results in some external actions being 
performed in the agent’s environment. For the purposes of this paper the only external 
action we assume is a ‘communication’ action which allows agents to communicate facts 
currently held in working memory to other agents. 

Our interest here is in the rule application and conflict resolution strategy adopted 
by the agent. In general, the conditions of a rule can be consistently matched against 
the items in working memory in more than one way, giving rise to a number of distinct 
rule instances. Following standard rule-based system terminology we call the set of rule 
instances the conflict set and the process of deciding which subset of rule instances is 
to be fired at any given cycle conflict resolution. Agents can adopt a wide range of rule 
application and conflict resolution strategies. For example, they can order the conflict 
set and fire only the first rule instance in the ordering at each cycle, or they can fire all 
rule instances in the conflict set once on each cycle, or they can repeatedly compute the 
conflict set and fire all the rule instances it contains until no new facts can be derived 
at the current cycle. We call these three strategies single rule at each cycle, all rules at 
each cycle, and all rules to quiescence respectively. 
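
The three strategies can be contrasted with a minimal sketch. It is only an illustration under simplifying assumptions that are not part of the paper: rules are ground (no variables), the conflict set is ordered by listing order, and an instance is dropped from the conflict set once its conclusion is already in working memory. The names `conflict_set`, `single_rule`, `all_rules` and `all_rules_to_quiescence` are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Fact = str
# Hypothetical ground rule: (conditions, fact asserted when the rule fires).
Rule = Tuple[FrozenSet[Fact], Fact]


def conflict_set(rules: List[Rule], memory: FrozenSet[Fact]) -> List[Rule]:
    """Rule instances whose conditions all hold and whose conclusion is new."""
    return [r for r in rules if r[0] <= memory and r[1] not in memory]


def single_rule(rules: List[Rule], memory: FrozenSet[Fact]) -> FrozenSet[Fact]:
    """Single rule at each cycle: fire only the first instance in the ordering."""
    cs = conflict_set(rules, memory)
    return memory | {cs[0][1]} if cs else memory


def all_rules(rules: List[Rule], memory: FrozenSet[Fact]) -> FrozenSet[Fact]:
    """All rules at each cycle: fire every instance in the conflict set once."""
    return memory | {r[1] for r in conflict_set(rules, memory)}


def all_rules_to_quiescence(rules: List[Rule], memory: FrozenSet[Fact]) -> FrozenSet[Fact]:
    """Recompute and fire the conflict set until no new fact is derived."""
    while True:
        nxt = all_rules(rules, memory)
        if nxt == memory:
            return memory
        memory = nxt


RULES: List[Rule] = [
    (frozenset({"tiger"}), "large"),
    (frozenset({"tiger"}), "carnivore"),
    (frozenset({"large", "carnivore"}), "dangerous"),
]
START: FrozenSet[Fact] = frozenset({"tiger"})
```

Starting from `START`, one cycle of `single_rule` adds only `large`, one cycle of `all_rules` adds `large` and `carnivore`, and `all_rules_to_quiescence` also reaches `dangerous` within the same cycle.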

3 Timed Reasoning Logics (TRL) 

The literature contains many attempts at providing a logic of limited or restricted rea- 
soning. However most of these do not explicitly take account of time. For example, 
Levesque’s [12] logic of implicit and explicit belief restricts an agent’s explicit beliefs 







(the classical possible worlds notion) by allowing non-classical (either incomplete or 
impossible) worlds to enter an agent’s epistemic accessibility relation. Although agents 
need not then believe all classical tautologies, they remain perfect reasoners in relevance 
logic. In [7] Fagin & Halpern propose an alternative approach to restricting possible 
worlds semantics which involves a syntactic awareness filter, such that an agent only 
believes a formula if it (or its subterms) are in his awareness set. Agents are modelled as 
perfect reasoners whose beliefs are restricted to some syntactic class compatible with the 
awareness filter. Konolige [10] represents beliefs as sentences belonging to an agent’s 
belief set, which is closed under the agent’s deduction rules. A deduction model assigns 
a set of rules to each agent, allowing representation of agents with differing reasoning 
capacities within a single system. However, the deduction model only tells us what a set of 
agents will believe after an indefinitely long period of deliberation. 

The only logical research we are aware of which represents reasoning as a process 
that explicitly requires time is step logic [2,4,3]. However, until recently, step logic 
lacked adequate semantics. In [15] Nirkhe, Kraus & Perlis propose a possible-worlds 
type semantics for step logic. However this re-introduces logical omniscience: once an 
agent learns that $\phi$, it simultaneously knows all logically equivalent statements. In more 
recent work [9], Grant, Kraus & Perlis propose a semantics for step logic which does not 
result in logical omniscience, and prove soundness and completeness results for families 
of theories describing timed reasoning. However, their logic for reasoning about time- 
limited reasoners is first-order and hence undecidable (even if the agents described are 
very simple). 

The approach we describe in this paper, Timed Reasoning Logics (TRL), avoids the 
problem of logical omniscience and is at the same time decidable. TRL is a context-logic 
style formalism for describing rule-based resource bounded reasoners who take 
time to derive the consequences of their knowledge. Not surprisingly, in order to avoid 
logical omniscience, a logic for reasoning about beliefs has to introduce syntactic objects 
representing formulas in its semantics. In [9], domains of models of the meta-logic for 
reasoning about agents contain objects corresponding to formulas of the agent’s logic. 
We have chosen a different approach, where models correspond to sets of agent’s states 
together with a transition relation (similar to [5]). States are identified with finite sets of 
formulas and the transition relation is computed using the agent rules. 

This paper builds on the work in [1], where we introduced TRL. In this section, we 
give a slightly more general formulation of TRL than that given in [1], and prove its 
soundness and completeness. 

3.1 TRL Syntax 

Our choice of syntax is influenced both by step logics and context logics and by Gabbay’s 
Labelled Deductive Systems [8]. To be able to reason about steps in deliberation and 
the time deliberation takes, we need a set of steps, or logical time points, which we will 
assume to be the set of natural numbers. To be able to reason about several agents, we 
also have a non-empty set of agents or reasoners $A = \{a, b, c, i, j, i_1, \dots, i_n, \dots\}$. 

Different agents may use different languages. To be able to model changes in the 
agent’s language, such as acquiring new names for things etc., we also index the language 
by time points: at time t, agent i speaks the language $\mathcal{L}^i_t$. 







Well formed formulas in the agent’s languages $\mathcal{L}^i_t$ are defined in the usual way. For 
example, if $\mathcal{L}^a_0$ (the agent a’s language at time 0) is a simple propositional logic with 
propositional variables $p_0, p_1, \dots, p_n$, then a well formed formula $\phi$ of $\mathcal{L}^a_0$ is defined as 

$$\phi = p_i \mid \neg\phi \mid \phi \to \phi \mid \phi \wedge \phi \mid \phi \vee \phi$$

As in context logic, we use labelled formulas to distinguish between beliefs of 
different agents at different times. If i is an agent, t is a moment of time, and $\phi$ a well-formed 
formula of the language $\mathcal{L}^i_t$, then $(i, t) : \phi$ is a well-formed labelled formula of TRL. 
The general form of an inference rule in TRL is: 

$$\frac{(i_1, t) : \phi_1, \ \dots, \ (i_n, t) : \phi_n}{(i, t+1) : \phi}$$

with a possible side condition of the form: provided that $(i_1, t) : \phi_1, \dots, (i_n, t) : \phi_n$ and 
the set $\Delta_t$ of all formulas derived at the previous stage in the derivation (see Definition 1 
below) satisfy some property. For example, a side condition for a defeasible rule may 
be that some formula is not in $\Delta_t$. 

A significant restriction on the format of possible TRL rules is that only finitely many 
formulas labelled t should be derivable starting with a finite set of labelled formulas $\Gamma$, 
for any t. For example, supposing we had an operator $B_a$ for “agent a believes that”, 
then the following negative introspection rule: 

$$(a, t+1) : \neg B_a \phi \quad \text{given that } (a, t) : \phi \notin \Delta_t$$

cannot be introduced in unrestricted form since it would generate infinitely many 
formulas at step t + 1. 

A simple example of a TRL rule is an inference rule corresponding to a rule in the 
agent’s program. If agent a’s program contains the rule 

$$A(x), B(x) \to C(x)$$

then the corresponding inference rule in TRL would be 

$$\frac{(a, t) : A(x), \ (a, t) : B(x)}{(a, t+1) : C(x)}$$

Depending on the agent’s rule application strategy, the TRL inference rule may have 
a side condition stating, for example, that it may only be applied if no other rule is 
applicable. 

Another kind of rule which we will see later is used to model communication between 
agents. For example, 

$$\frac{(a, t) : \phi}{(b, t+1) : B_a \phi}$$

expresses the fact that whenever a believes $\phi$, at the next step b believes $B_a \phi$. In this paper, 
we do not explicitly model message passing. Instead we assume that whenever an agent 
derives a fact of a certain form it communicates this fact to other agents. The message 
arrives at the next tick of the clock, and is ‘observed’ immediately. In the example above, 
whenever a derives $\phi$, it sends a message containing $\phi$ to b, which arrives at t + 1. This 
model corresponds to perfect broadcast communication with a fixed one tick delay. 

The derivability relation in a TRL logic may be non-monotonic due to the agent’s 
rule application strategy (e.g. only one of the rules is applied at each cycle) or to the 
presence of defeasible rules. Before we give a formal definition of derivability, we need 
a couple of auxiliary definitions. Let R be a set of TRL rules and $\Delta$ a finite set of labelled 
formulas. Then by $R(\Delta)$ we denote the set of all labelled formulas derivable from $\Delta$ 
by one application of a rule in R. Formally, $R(\Delta)$ is the set of all labelled formulas 
$(i, t+1) : \phi$ such that there is a rule in R of the form 

$$\frac{(i_1, t) : \phi_1, \ \dots, \ (i_n, t) : \phi_n}{(i, t+1) : \phi}$$

and $(i_1, t) : \phi_1, \dots, (i_n, t) : \phi_n \in \Delta$ and any side condition of the rule holds for 
$(i_1, t) : \phi_1, \dots, (i_n, t) : \phi_n$ and $\Delta$. Finally, given a set of labelled formulas $\Gamma$, we write 
$\Gamma_k$ for the subset of $\Gamma$ labelled by time point k (formulas in $\Gamma$ of the form $(j, k) : \phi$ for 
any agent j). 

Definition 1. Given a set of TRL rules R, a labelled formula $(i, t) : \phi$ is derivable using 
R from a set of labelled formulas $\Gamma$: 

$$\Gamma \vdash_R (i, t) : \phi$$

if there exists a sequence of finite sets of labelled formulas 

$$\Delta_0, \Delta_1, \dots, \Delta_t$$

such that $(i, t) : \phi \in \Delta_t$ and 

1. $\Delta_0$ is the union of $\Gamma_0$ and all axioms in R labelled by time 0 (i.e., $(j, 0)$ for some 
agent j). 

2. $\Delta_k$ is the union of $\Gamma_k$ and $R(\Delta_{k-1})$. 
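
Definition 1 can be read as a simple forward computation of the sets $\Delta_0, \dots, \Delta_t$. The sketch below is one possible rendering, not the authors' implementation. It assumes that a labelled formula is encoded as a triple (agent, time, formula string) and that each TRL rule is a hypothetical Python function mapping $\Delta_{k-1}$ to the labelled formulas it licenses at step k, so any side condition is checked inside that function. The example rules `tell_b` and `persist` are illustrative only.

```python
from typing import Callable, FrozenSet, Iterable, Set, Tuple

Labelled = Tuple[str, int, str]  # (agent, time point, formula)
# A rule maps Delta_{k-1} to the labelled formulas it licenses at step k.
TRLRule = Callable[[FrozenSet[Labelled]], Set[Labelled]]


def derivable(rules: Iterable[TRLRule], gamma: Set[Labelled], query: Labelled,
              axioms0: FrozenSet[Labelled] = frozenset()) -> bool:
    """Build Delta_0, ..., Delta_t and test whether the query is in Delta_t."""
    rules = list(rules)
    t = query[1]
    # Delta_0 = Gamma_0 together with the axioms labelled by time 0.
    delta = frozenset(lf for lf in gamma if lf[1] == 0) | axioms0
    for k in range(1, t + 1):
        gamma_k = frozenset(lf for lf in gamma if lf[1] == k)
        produced: Set[Labelled] = set()
        for rule in rules:
            produced |= rule(delta)   # R(Delta_{k-1})
        delta = gamma_k | produced    # Delta_k = Gamma_k union R(Delta_{k-1})
    return query in delta


def tell_b(delta: FrozenSet[Labelled]) -> Set[Labelled]:
    """Communication rule: from (a,t):phi infer (b,t+1):B_a(phi)."""
    return {("b", t + 1, f"B_a({phi})") for (i, t, phi) in delta if i == "a"}


def persist(delta: FrozenSet[Labelled]) -> Set[Labelled]:
    """Illustrative persistence rule: every formula is carried to the next step."""
    return {(i, t + 1, phi) for (i, t, phi) in delta}


GAMMA: Set[Labelled] = {("a", 0, "p")}
```

Because each step only inspects the previous set, the loop terminates after t iterations whenever every rule returns a finite set, which mirrors the finiteness restriction placed on TRL rules.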

3.2 TRL Semantics 

We identify the local state of agent i at time t, $m^i_t$, with a finite set $\{\phi_1, \dots, \phi_n\}$ of 
formulas of the agent’s language at time t, i.e. $\mathcal{L}^i_t$. At this point, we don’t require anything 
else in addition to finiteness. In particular, this set may be empty or inconsistent. 

A TRL model is a set of local TRL states. Each local state in a TRL model is indexed 
by an element of the index set $I = A \times \mathbb{N}$, which is the set of pairs $(i, t)$, where i is 
an agent and t is the step number. In addition, a TRL model should satisfy constraints 
which make it a valid representation of a run of a multi-agent system. To formulate 
those constraints, we need the additional notions of observation and inference, which 
constrain how the next state of an agent will look. 

Each agent has a program, that is, a set of rules which it uses to derive its next state given 
its current state and any new beliefs it obtains by observing the world. We therefore equip 
each model with an $\mathit{obs}$ function and a set of $\mathit{inf}_i$ functions (one for each agent i). 
Intuitively, $\mathit{obs}$ models observations, which we take to include inter-agent communication; 
it takes a step t and an agent i as arguments and returns a finite set of formulas in the 
agent’s language at that step. This set is added to the agent’s state at the same step (we 
thus model observations as being believed instantaneously). Each $\mathit{inf}_i$ models agent i’s 
computation of a new state by mapping a finite set of formulas in the language $\mathcal{L}^i_t$ to 
another finite set of formulas in the language $\mathcal{L}^i_{t+1}$. Intuitively, $\mathit{inf}_i$ takes the tokens in 
agent i’s state at time t, applies the rules in i’s program to them to obtain a new set of 
tokens, which, together with i’s observations at time t + 1, constitute its state at time 
t + 1. 

Definition 2 (TRL Model). Let A be a set of agents and $\{\mathcal{L}^i_t : i \in A, t \in \mathbb{N}\}$ a set of 
agent languages. A TRL model M is a tuple $\langle \mathit{obs}, \mathit{inf}, \{m^i_t : i \in A, t \in \mathbb{N}\} \rangle$ where $\mathit{obs}$ 
is a function which maps a pair $(i, t)$ to a finite set of formulas in $\mathcal{L}^i_t$, $\mathit{inf}_i$ is a function 
from finite sets of formulas in $\mathcal{L}^i_t$ to finite sets of formulas in $\mathcal{L}^i_{t+1}$, and each $m^i_t$ is a 
finite set of formulas in $\mathcal{L}^i_t$ such that $m^i_{t+1} = \mathit{inf}_i(m^i_t) \cup \mathit{obs}(i, t+1)$. 
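
The state equation of Definition 2 can be unrolled step by step. The following sketch is an illustration under stated assumptions rather than part of the formalism: formulas are plain strings, the initial state is taken to be `obs(i, 0)` (an assumption suggested by the base case of Lemma 1, not stated in the definition), and `unroll`, `holds`, `obs` and `inf_a` are hypothetical names.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

State = FrozenSet[str]


def unroll(agents: List[str],
           obs: Callable[[str, int], Set[str]],
           inf: Dict[str, Callable[[State], Set[str]]],
           steps: int) -> Dict[Tuple[str, int], State]:
    """Compute the local states m[(i, t)] for t = 0..steps."""
    # Assumption: the state at time 0 consists of the observations at time 0.
    m: Dict[Tuple[str, int], State] = {(i, 0): frozenset(obs(i, 0)) for i in agents}
    for t in range(steps):
        for i in agents:
            # m^i_{t+1} = inf_i(m^i_t) union obs(i, t+1)
            m[(i, t + 1)] = frozenset(inf[i](m[(i, t)])) | frozenset(obs(i, t + 1))
    return m


def holds(m: Dict[Tuple[str, int], State], i: str, t: int, phi: str) -> bool:
    """Satisfaction: (i,t):phi is true iff phi is in the state indexed by (i,t)."""
    return phi in m[(i, t)]


def obs(i: str, t: int) -> Set[str]:
    """Illustrative observation function: agent a sees a tiger at time 0."""
    return {"tiger(c)"} if (i, t) == ("a", 0) else set()


def inf_a(state: State) -> Set[str]:
    """Illustrative inference function: keep all facts and classify tigers."""
    new = set(state)
    if "tiger(c)" in state:
        new.add("large-carnivore(c)")
    return new


model = unroll(["a"], obs, {"a": inf_a}, steps=2)
```

Note that `inf_a` sees only agent a's own state, so in this reading any inter-agent communication has to enter through `obs`.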



Definition 3 (Satisfaction and Logical Entailment). A labelled formula $(i, t) : \phi$ is 
true in a model, written $M \models (i, t) : \phi$, iff $\phi \in m^i_t$ (the state indexed by $(i, t)$ in M 
contains $\phi$). A labelled formula $(i, t) : \phi$ is valid, $\models (i, t) : \phi$, iff for all models M, 
$M \models (i, t) : \phi$. Let $\Gamma$ be a set of labelled formulas. $\Gamma$ logically entails $(i, t) : \phi$, 
$\Gamma \models (i, t) : \phi$, if in all models where $\Gamma$ is true, $(i, t) : \phi$ is true. 

3.3 Soundness and Completeness of TRL 

In this section we prove a general soundness and completeness result for TRL systems. 
We are going to show that given a set of TRL rules R (the only condition on R is that 
starting from a finite set of premises, it only produces a finite set of consequences labelled 
t, for any t) and a set of TRL models S, describing possible runs of a multi-agent system, 
R is sound and complete with respect to S if, and only if, S is the set of models which 
conform to R in the sense defined below. 

Definition 4. A TRL model M conforms to a set of TRL rules R if for every rule in R 
of the form 

$$\frac{(i_1, t) : \phi_1, \ \dots, \ (i_n, t) : \phi_n}{(i, t+1) : \psi}$$

possibly with some side condition on $\Delta_t$, M satisfies the property that if for all premises 
of the rule, $\phi_k \in m^{i_k}_t$, and the side condition of the rule holds for $\bigcup_{j \in A} m^j_t$ substituted 
for $\Delta_t$, then $\psi \in m^i_{t+1}$. 

Before proving the main theorem, we need one more notion, similar to the notion of 
a knowledge-supported model in [9]: 

Definition 5. [Minimal Model] A TRL model M conforming to a set of TRL rules R is 
a minimal model for a set of labelled formulas $\Gamma$ if for every i, t and $\phi$, $\phi \in m^i_t$ iff one 
of the following holds: 

1. there is a rule in R of the form 

$$\frac{(i_1, t) : \phi_1, \ \dots, \ (i_n, t) : \phi_n}{(i, t+1) : \phi}$$

such that for all premises of the rule, $\phi_k \in m^{i_k}_{t-1}$ and the side condition of the rule holds for 
$\bigcup_{j \in A} m^j_{t-1}$ (in other words, $\phi$ is in $m^i_t$ because the model conforms to R) 

2. or $(i, t) : \phi \in \Gamma$ in which case $\phi \in \mathit{obs}(i, t)$. 

A minimal model for $\Gamma$ only satisfies the formulas in $\Gamma$ and their logical 
consequences. 

Lemma 1. Let M be a minimal model for $\Gamma$ conforming to R. Then for every formula 
$\phi$, $\phi \in m^i_t$ iff $\Gamma \vdash_R (i, t) : \phi$. 

Proof. The proof goes by induction on t. If t = 0, then the only way $\phi \in m^i_0$ is because 
$\phi \in \mathit{obs}(i, 0)$, hence $(i, 0) : \phi \in \Gamma$, so $\Gamma \vdash_R (i, 0) : \phi$. Inductive hypothesis: suppose 
that for all agents j and all $s \le t$, $\phi \in m^j_s$ iff $\Gamma \vdash_R (j, s) : \phi$. Let $\phi \in m^i_{t+1}$. Then 
either $(i, t+1) : \phi \in \Gamma$, hence $\Gamma \vdash_R (i, t+1) : \phi$, or there is a rule in R of the form 

$$\frac{(i_1, t) : \phi_1, \ \dots, \ (i_n, t) : \phi_n}{(i, t+1) : \psi}$$

such that $\psi = \phi$ and $\phi_1 \in m^{i_1}_t, \dots, \phi_n \in m^{i_n}_t$ (and the side condition of the rule holds for 
the set of formulas in the union of all states at time t). By the inductive hypothesis, 
$\Gamma \vdash_R (i_k, t) : \phi_k$. Hence by this same rule, $\Gamma \vdash_R (i, t+1) : \phi$. 

Theorem 1. Given a set of TRL rules R, for any finite set of labelled formulas $\Gamma$ and a 
labelled formula $\phi$, $\Gamma \vdash_R \phi$ iff $\Gamma \models_{\mathcal{R}} \phi$ where $\mathcal{R}$ is the set of all models conforming to 
R. 

Proof. Soundness ($\Gamma \vdash_R \phi \Rightarrow \Gamma \models_{\mathcal{R}} \phi$) is standard: clearly, in a model conforming 
to R the rules in R preserve validity. 

Completeness: suppose $\Gamma \models_{\mathcal{R}} \phi$. Consider a minimal model for $\Gamma$, $M_\Gamma$, conforming 
to R. Since $\Gamma \models_{\mathcal{R}} \phi$ and our particular model $M_\Gamma$ conforms to R and satisfies $\Gamma$, 
$M_\Gamma \models \phi$. From Lemma 1, $\Gamma \vdash_R \phi$. 

Theorem 2. Given a set of TRL rules R, for any finite set of labelled formulas $\Gamma$ and a 
labelled formula $\phi$, it is decidable whether $\Gamma \vdash_R \phi$ or $\Gamma \models_{\mathcal{R}} \phi$ where $\mathcal{R}$ is the set of 
all models conforming to R. 

Proof. From Theorem 1 above, the questions whether $\Gamma \vdash_R (i, t) : \phi$ and whether 
$\Gamma \models_{\mathcal{R}} (i, t) : \phi$, where $\mathcal{R}$ is the set of models conforming to R, are equivalent. Consider 
a minimal model $M_\Gamma$ for $\Gamma$. If $\Gamma \models_{\mathcal{R}} (i, t) : \phi$, then $\phi \in m^i_t$ in $M_\Gamma$. On the other hand, 
from Lemma 1, if $\phi \in m^i_t$ then $\Gamma \vdash_R (i, t) : \phi$. Hence $\phi \in m^i_t$ iff $\Gamma \vdash_R (i, t) : \phi$ iff 
$\Gamma \models_{\mathcal{R}} (i, t) : \phi$. 

It is easy to see that given that $\Gamma$ is finite and rules in R only produce a finite number 
of new formulas at each step, the initial segment of $M_\Gamma$ (up to step t) can be constructed 
in time bounded by a tower of exponentials in $|\Gamma|$ of height t (but nevertheless bounded). 
Then we can inspect $m^i_t$ to see if $\phi$ is there. 





4 TRL(CLIPS) 

As an example of a logical model of an agent based on TRL, we show how to model 
a simple system consisting of two communicating agents. The agents use a CLIPS- 
style [19] single rule at each cycle rule application strategy. However each agent uses 
a different CLIPS conflict resolution strategy. We show that the adoption of different 
conflict resolution strategies by each agent can result in a reduction in the time required 
to derive information for action selection. 

CLIPS has been used to build a number of agent-based systems (see, e.g., [14]). 
In CLIPS each rule has a salience reflecting its importance in problem solving. At 
each cycle, all rules are matched against the facts in working memory and any new 
rule instances are added to the conflict set. Rule matching is refractory, i.e., rules don’t 
match against the same set of premises more than once. New rule instances are placed 
above all rule instances of lower salience and below all rules of higher salience. If rule 
instances have equal salience, ties are broken by the conflict resolution strategy. CLIPS 
supports a variety of conflict resolution strategies including depth, breadth, simplicity, 
complexity, lex, mea, and random. The default strategy, called depth, gives preference 
to new rule instances; breadth places older rule instances higher. Once the conflict set 
has been computed, CLIPS fires the highest ranking rule instance in the conflict set at 
each cycle. 

Consider an agent with the following set of rules using the depth conflict resolution 
strategy: 

R1: tiger(x) -> large-carnivore(x) 

R2: large-carnivore(x) -> dangerous(x) 

R1 has greater salience than R2. If the agent’s working memory contains the following 
fact: 

0: tiger(c) 

then at the next cycle the agent would derive 

1: large-carnivore(c) 

Assume that at this cycle the agent observes a second tiger, and a corresponding fact is 
asserted into working memory: 

1: tiger(d) 

Instances of R1 have greater salience than instances of R2, so on the following cycle the 
agent will derive 

2: large-carnivore(d) 

Both “large-carnivore(c)” and “large-carnivore(d)” match R2, but 
“large-carnivore(d)” will be preferred since it is a more recent instance of 
R2 than “large-carnivore(c)”. On the following cycle the agent will derive 

3: dangerous(d) 

Finally the agent derives: 

4: dangerous(c) 

This is a trivial example. However, in general, the time at which a fact is derived can be 
significant. For example, in developing an agent we may wish to ensure that it responds 
to dangers as soon as they are perceived rather than after classifying objects in the 
environment. In our short example, the delay in identifying danger is just one step, but it 
is easy to modify the example to make the delay arbitrarily long (by introducing n new 
tigers instead of one at cycle 1). 
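
The tiger example can be replayed with a small simulation. This is a sketch under assumptions that go beyond the text: rules have a single premise, salience is encoded as an integer, observations for cycle t are asserted at the start of that cycle, ties between instances of equal salience and age are broken by the name of the individual, and this is not CLIPS itself. `RULES` and `run` are hypothetical names.

```python
from typing import Dict, List, Set, Tuple

FactT = Tuple[str, str]  # (predicate, individual)
# Hypothetical encoding: (rule name, salience, premise predicate, conclusion predicate).
RULES = [("R1", 2, "tiger", "large-carnivore"),
         ("R2", 1, "large-carnivore", "dangerous")]


def run(observations: Dict[int, Set[FactT]], cycles: int,
        strategy: str = "depth") -> List[Tuple[int, str]]:
    """Fire one rule instance per cycle; return the derived facts with time labels."""
    salience = {name: s for name, s, _, _ in RULES}
    conclusion = {name: c for name, _, _, c in RULES}
    sign = 1 if strategy == "depth" else -1   # depth prefers newer, breadth older
    memory: Set[FactT] = set()
    fired: Set[Tuple[str, str]] = set()        # refractoriness: fire each instance once
    added_at: Dict[Tuple[str, str], int] = {}  # cycle at which an instance was added
    trace: List[Tuple[int, str]] = []
    for t in range(cycles):
        memory |= observations.get(t, set())
        for name, _, premise, _ in RULES:
            for pred, x in memory:
                if pred == premise and (name, x) not in fired:
                    added_at.setdefault((name, x), t)
        candidates = [inst for inst in added_at if inst not in fired]
        if candidates:
            best = max(candidates,
                       key=lambda inst: (salience[inst[0]], sign * added_at[inst], inst[1]))
            fired.add(best)
            fact = (conclusion[best[0]], best[1])
            memory.add(fact)                   # the fact is labelled with step t + 1
            trace.append((t + 1, f"{fact[0]}({fact[1]})"))
    return trace


TWO_TIGERS = {0: {("tiger", "c")}, 1: {("tiger", "d")}}
```

With the depth strategy, `run(TWO_TIGERS, 4)` reproduces the sequence above: `large-carnivore(c)` at 1, `large-carnivore(d)` at 2, `dangerous(d)` at 3 and `dangerous(c)` at 4. Passing `strategy="breadth"` swaps the last two conclusions, and dropping the second tiger yields `dangerous(c)` already at step 2.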

It is easy to see that the TRL logic corresponding to the single rule at each cycle 
strategy is non-monotonic. For instance, in the example above, $\{0 : \mathit{tiger}(c)\} \vdash 
2 : \mathit{dangerous}(c)$, but $\{0 : \mathit{tiger}(c),\ 1 : \mathit{tiger}(d)\} \nvdash 2 : \mathit{dangerous}(c)$. 

To reflect salience of rules, we assume that there is a partial order $<^j_r$ on the set of 
rules $\mathcal{R}_j = \{R_1, \dots, R_n\}$ which correspond to the rules of agent j’s program. Note that 
the logic will contain more rules describing agent j in addition to $\mathcal{R}_j$; e.g. rules which 
model observation, or the fact that formulas persist in the state. To determine which rule 
instance will be fired at a given step in a TRL(CLIPS) derivation, we need to compute a 
‘conflict set’ of sets of premises matching rules in $\mathcal{R}_j$, order it by a total order, and fire 
the rule with the premises which come top in that order. The total order on the conflict 
set is determined by the agent’s conflict resolution strategy. 

To be more formal, let $\Delta_t$ be the set of all formulas derived at step t. Let $C_{j,t}$ be 
the conflict set for j at t, namely 

$$C_{j,t} = \{\langle (j,t):\phi_1, \dots, (j,t):\phi_n, R_i \rangle : (j,t):\phi_1, \dots, (j,t):\phi_n \in \Delta_t,\ R_i \in \mathcal{R}_j, \text{ and } (j,t):\phi_1, \dots, (j,t):\phi_n \text{ match } R_i\}.$$

Define the order $<_{depth}$ (depth order on $C_{j,t}$, to be read as ‘lower in the depth order’) 
as follows: 

$$\langle (j,t):\phi_1, \dots, (j,t):\phi_n, R_i \rangle <_{depth} \langle (j,t):\psi_1, \dots, (j,t):\psi_n, R_m \rangle$$

iff 

1. $R_i <^j_r R_m$ ($R_i$ has lower salience); or 

2. $R_i =^j_r R_m$, but $\langle (j,t):\phi_1, \dots, (j,t):\phi_n, R_i \rangle$ is an earlier rule instance, that 
is, for some $\Delta_s$ with $s < t$, $\langle (j,s):\phi_1, \dots, (j,s):\phi_n, R_i \rangle \in C_{j,s}$ and 
$\langle (j,s):\psi_1, \dots, (j,s):\psi_n, R_m \rangle \notin C_{j,s}$; or 

3. $(j,t):\phi_1, \dots, (j,t):\phi_n$ and $(j,t):\psi_1, \dots, (j,t):\psi_n$ match rules of the same 
salience and were added to the conflict set at the same time, but 
$(j,t):\phi_1, \dots, (j,t):\phi_n$ is lower in some arbitrary, e.g., lexicographic, order. 

For the breadth order $<_{breadth}$ we reverse the second clause of the definition; now 
the premises which belong to a conflict set $C_{j,s}$ for the earliest time s are higher in the 
order. 

We introduce the meta-logical abbreviations $top_{j,depth}(\phi_1, \dots, \phi_m, \Delta_t)$ and 
$top_{j,breadth}(\phi_1, \dots, \phi_m, \Delta_t)$ to indicate that the set of premises $\phi_1, \dots, \phi_m$ is 
the highest in the $<_{depth}$ ($<_{breadth}$) order among the conflict set $C_{j,t}$ of formulas from 
$\Delta_t$. 

Finally, we need to account for the refractoriness of the CLIPS rule application 
strategy: any rule instance is only used once in the TRL(CLIPS) derivation. To be precise, 
for any rule $R_j$ and a set of premises $(i,t):\phi_1, \dots, (i,t):\phi_n$ matching this rule, if at 
some step $s < t$, the rule $R_j$ was fired with a set of premises which were the same but for 
step label (e.g. $(i,s):\phi_1, \dots, (i,s):\phi_n$), then $(i,t):\phi_1, \dots, (i,t):\phi_n$ are excluded 
from the conflict set $C_{i,t}$. 

The rules of a single rule at each cycle agent i using the depth strategy then become 
(for $\psi$): 

$$\frac{(i,t):\phi_1, \ \dots, \ (i,t):\phi_n, \ \Delta_t}{(i,t+1):\psi}$$

provided $top_{i,depth}((i,t):\phi_1, \dots, (i,t):\phi_n, \Delta_t)$, namely the premises of the rule are 
maximal in the $<_{depth}$ order in the conflict set for i at t. In what follows, we refer to 
such a proviso as ‘standard proviso for depth order’. For example, the agent a from the 
example above has a rule: 

$$\frac{(a,t):\mathit{Tiger}(x), \ \Delta_t}{(a,t+1):\mathit{Large\text{-}Carnivore}(x)}$$

provided $top_{a,depth}((a,t):\mathit{Tiger}(x), \Delta_t)$. 

For monotonic agents (who keep all the facts they derived earlier) we have an 
additional monotonicity rule which does not have a side condition, is always applicable, and 
is excluded from the ordering of the internal agent rules proper: 

$$\frac{(i,t):\phi}{(i,t+1):\phi}$$

To give an example of an observation rule, suppose that the agent $a$ gets some of its
information about the world from agent $b$. In particular, if $b$ decides that something is
nearby, then at the next step $a$ also decides that it is nearby:

$$\frac{(b,t):Near(x)}{(a,t+1):Near(x)}$$

This rule also does not have any side conditions.

The notion of derivation in TRL(CLIPS) is a special case of TRL derivation as given
in Definition 1.

4.1 Example 

In this section we give a worked example of a derivation in TRL(CLIPS). Our example
involves two agents, $a$ and $b$. They have the same set of rules with the same salience
order and start with the same set of observations, but $a$ uses the depth strategy, while $b$
uses the breadth conflict resolution strategy. We show that they can both reach the same
conclusion (classifying a tiger as a dangerous object), but that if they communicate, they
can reach this conclusion faster.

The rules corresponding to the program rules of agent $a$ are (with the standard proviso
for depth order):

$$\frac{(a,t):Large(x), \ (a,t):Carnivore(x), \ (a,t):Near(x), \ (a,t):Free(x), \ \Delta_t}{(a,t+1):Dangerous(x)} \quad R1$$

$$\frac{(a,t):Bengal\text{-}Tiger(x), \ \Delta_t}{(a,t+1):Tiger(x)} \quad R2$$

$$\frac{(a,t):Tiger(x), \ \Delta_t}{(a,t+1):Large(x)} \quad R3$$

$$\frac{(a,t):Tiger(x), \ \Delta_t}{(a,t+1):Carnivore(x)} \quad R4$$

$$\frac{(a,t):Distance{<}5m(x), \ \Delta_t}{(a,t+1):Near(x)} \quad R5$$

$$\frac{(a,t):\neg Caged(x), \ \Delta_t}{(a,t+1):Free(x)} \quad R6$$



The rules for agent $b$ are the same, with $top_{a,depth}$ replaced with $top_{b,breadth}$. The
salience order on rules is $R1 >_r R2 >_r \{R3, R4, R5, R6\}$.

In addition, both agents have the monotonicity rule and the following communication
rules:

$$\frac{(a,t):Large(x)}{(b,t+1):Large(x)} \qquad \frac{(a,t):Carnivore(x)}{(b,t+1):Carnivore(x)}$$

$$\frac{(b,t):Near(x)}{(a,t+1):Near(x)} \qquad \frac{(b,t):Free(x)}{(a,t+1):Free(x)}$$

Suppose both agents start with the same set of observations, corresponding to a
sighting of a Bengal tiger at a distance less than 5 meters, and apparently uncaged:
$(a,0):Bengal\text{-}Tiger(c)$, $(a,0):Distance{<}5m(c)$, $(a,0):\neg Caged(c)$,
$(b,0):Bengal\text{-}Tiger(c)$, $(b,0):Distance{<}5m(c)$, $(b,0):\neg Caged(c)$. At this step, both
agents' conflict sets are the same: all formulas match one of the rules, but the highest
salience rule is R2, in the case of $a$ matched by $(a,0):Bengal\text{-}Tiger(c)$. The other
two rule instances in $C_{a,0}$ are $(a,0):Distance{<}5m(c)$ matching R5 and
$(a,0):\neg Caged(c)$ matching R6 (similarly for $C_{b,0}$). So at the next step, $\Delta_1$ contains
$(a,1):Bengal\text{-}Tiger(c)$, $(a,1):Distance{<}5m(c)$, $(a,1):\neg Caged(c)$, by the monotonicity
rule, and $(a,1):Tiger(c)$ by R2, and corresponding formulas for $b$. From step 1, the
conflict sets of the two agents diverge: agent $a$ places a new rule instance,
$(a,1):Tiger(c)$ which matches R3, at the top of the conflict set, while agent $b$ favours one of
the old rule instances, let's say R5. The new formulas in $\Delta_2$ are $(a,2):Large(c)$ and
$(b,2):Near(c)$.

At this stage, the top rule instance for $a$ is $(a,2):Tiger(c)$ matching R4, while the
top rule instance for $b$ is $(b,2):\neg Caged(c)$ matching R6. In addition, both agents have
now derived formulas of the kind they communicate to each other; so at the next step, $a$
will discover that $c$ is nearby and $b$ will discover that $c$ is large. The new formulas in $\Delta_3$
are $(a,3):Carnivore(c)$, $(a,3):Near(c)$, $(b,3):Free(c)$, $(b,3):Large(c)$.




At the next step, both agents will acquire the facts $(a,4):Large(c)$,
$(a,4):Carnivore(c)$, $(a,4):Near(c)$, $(a,4):Free(c)$, and will match the rule with the
top salience, R1, to derive $(a,5):Dangerous(c)$ (similarly for $b$). The reader will easily
verify that it would have taken the agents longer to derive $Dangerous(c)$ without
communication.
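
The timing claim can be checked mechanically. The following is a minimal propositional sketch, not the authors' implementation: it assumes a single object $c$, encodes the salience order as integers, and breaks ties between instances of equal salience and age by rule name (the "arbitrary order" of the definition). The names `RULES`, `COMM`, `STRATEGY` and `simulate` are illustrative only.

```python
# Propositional sketch of the two-agent example (single object c).
# Rule name -> (salience, premises, conclusion); a higher salience wins.
RULES = {
    "R1": (3, {"Large", "Carnivore", "Near", "Free"}, "Dangerous"),
    "R2": (2, {"BengalTiger"}, "Tiger"),
    "R3": (1, {"Tiger"}, "Large"),
    "R4": (1, {"Tiger"}, "Carnivore"),
    "R5": (1, {"DistanceLt5m"}, "Near"),
    "R6": (1, {"NotCaged"}, "Free"),
}
# (sender, fact): the other agent learns the fact one step after the sender holds it.
COMM = [("a", "Large"), ("a", "Carnivore"), ("b", "Near"), ("b", "Free")]
STRATEGY = {"a": "depth", "b": "breadth"}


def simulate(communicate, max_steps=20):
    """Return {agent: first step at which the agent holds Dangerous}."""
    start = {"BengalTiger", "DistanceLt5m", "NotCaged"}
    facts = {ag: set(start) for ag in STRATEGY}
    agenda = {ag: {} for ag in STRATEGY}    # rule name -> step it entered the conflict set
    fired = {ag: set() for ag in STRATEGY}  # refractoriness: each instance fires once
    reached = {}
    for t in range(max_steps):
        for ag in STRATEGY:
            if "Dangerous" in facts[ag]:
                reached.setdefault(ag, t)
        if len(reached) == len(STRATEGY):
            break
        nxt = {ag: set(facts[ag]) for ag in STRATEGY}  # monotonicity rule
        for ag, strat in STRATEGY.items():
            for name, (_, premises, _) in RULES.items():
                if name not in fired[ag] and name not in agenda[ag] and premises <= facts[ag]:
                    agenda[ag][name] = t
            if agenda[ag]:
                sign = -1 if strat == "depth" else 1  # depth: newest first; breadth: oldest first
                best = min(agenda[ag],
                           key=lambda n: (-RULES[n][0], sign * agenda[ag][n], n))
                nxt[ag].add(RULES[best][2])
                fired[ag].add(best)
                del agenda[ag][best]
            if communicate:
                for sender, fact in COMM:
                    if sender != ag and fact in facts[sender]:
                        nxt[ag].add(fact)
        facts = nxt
    return reached


print(simulate(communicate=True))   # {'a': 5, 'b': 5}
print(simulate(communicate=False))  # {'a': 6, 'b': 6}
```

Under these assumptions the sketch reproduces the trace above (R2, then R3 and R5, then R4 and R6) and shows that, without the communication rules, each agent needs one more cycle.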



5 Conclusion 

In this paper we showed how to model the execution of communicating rule-based agents
using Timed Reasoning Logics (TRL). Our framework allows us to model agents at a
fine-grained level, so that we can prove, for example, that an agent will use a given
number of computation cycles to arrive at a given conclusion.

In previous work [1], we showed how to model a single-rule-at-each-cycle strategy
similar to that employed by the CLIPS [19] rule-based system architecture, and sketched
a logic TRL(CLIPS). In this paper, we prove a general soundness and completeness
result for TRL, from which soundness and completeness of TRL(CLIPS) follow. We
study TRL(CLIPS) in more detail and give a detailed example involving two
communicating agents using the CLIPS rule application strategy. The example is quite simple, but
it demonstrates that we can compare different agent designs and prove properties of
various conflict resolution strategies in the presence of communication between agents.

In the future, we plan to add a more fine-grained analysis of action and communica- 
tion to the TRL framework. It would also be interesting to investigate more systematically 
the impact of communication on the time required by agents to reach a given conclusion. 



References 

1. N. Alechina, B. Logan, and M. Whitsey. A complete and decidable logic for resource-bounded
agents. In Proceedings of the Third International Joint Conference on Autonomous Agents
and Multi-Agent Systems (AAMAS 2004). ACM Press, July 2004.

2. J. Drapkin and D. Perlis. A preliminary excursion into Step-Logics. Proceedings of the
SIGART International Symposium on Methodologies for Intelligent Systems, pages 262-269,
1986.

3. J. Elgot-Drapkin, M. Miller, and D. Perlis. Memory, reason and time: the Step-Logic approach.
In R. Cummins and J. Pollock, editors, Philosophy and AI: Essays at the Interface, pages 79-103.
MIT Press, Cambridge, Mass., 1991.

4. J. Elgot-Drapkin and D. Perlis. Reasoning situated in time I: Basic concepts. Journal of
Experimental and Theoretical Artificial Intelligence, 2(1):75-98, 1990.

5. R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about Knowledge. MIT Press,
Cambridge, Mass., 1995.

6. R. Fagin and J. Y. Halpern. Belief, awareness and limited reasoning. In Proceedings of the
Ninth International Joint Conference on Artificial Intelligence (IJCAI-85), pages 491-501,
Los Angeles, CA, 1985.

7. R. Fagin and J. Y. Halpern. Belief, awareness and limited reasoning. Artificial Intelligence,
34:39-76, 1988.

8. Dov M. Gabbay. Labeled Deductive Systems: Volume I - Foundations. Oxford University
Press, 1996.




9. John Grant, Sarit Kraus, and Donald Perlis. A logic for characterizing multiple bounded
agents. Autonomous Agents and Multi-Agent Systems, 3(4):351-387, 2000.

10. K. Konolige. A Deduction Model of Belief. Morgan Kaufmann, 1986.

11. G. Lakemeyer. Steps towards a first-order logic of explicit and implicit belief. In J. Y.
Halpern, editor, Theoretical Aspects of Reasoning About Knowledge: Proceedings of the
1986 Conference, pages 325-340, San Francisco, 1986. Morgan Kaufmann.

12. H. J. Levesque. A logic of implicit and explicit belief. In Proceedings of the Fourth National
Conference on Artificial Intelligence (AAAI '84), pages 198-202, 1984.

13. R. C. Moore. Logic and Representations. Number 39 in CSLI Lecture Notes. CSLI
Publications, 1995.

14. NASA. Proceedings of the Third Conference on CLIPS (CLIPS '94), Lyndon B. Johnson
Space Center, September 1994.

15. M. Nirkhe, S. Kraus, and D. Perlis. Thinking takes time: a modal active-logic for reasoning
in time. Technical Report CS-TR-3249, University of Maryland, Department of Computer
Science, 1994.

16. R. Parikh. Knowledge and the problem of logical omniscience. In Methodologies for
Intelligent Systems, Proceedings of the Second International Symposium, pages 432-439.
North-Holland, 1987.

17. A. S. Rao and M. P. Georgeff. Modeling rational agents within a BDI-architecture. In
Proceedings of the Second International Conference on Principles of Knowledge Representation
and Reasoning (KR'91), pages 473-484, 1991.

18. M. P. Singh. Know-How. In Michael Wooldridge and Anand Rao, editors, Foundations of
Rational Agency, pages 81-104. Kluwer, Dordrecht, 1999.

19. Software Technology Branch, Lyndon B. Johnson Space Center, Houston. CLIPS Reference
Manual: Version 6.21, June 2003.

20. W. van der Hoek, B. van Linder, and J-J. Ch. Meyer. An integrated modal approach to rational
agents. In M. Wooldridge and A. Rao, editors, Foundations of Rational Agency, pages 133-168.
Kluwer, Dordrecht, 1999.

21. M. Whitsey. Timed reasoning logics: An example. In Proceedings of the Logic and
Communication in Multi-Agent Systems Workshop (LCMAS 2004). Loria, 2004.




On the Relation Between ID-Logic and Answer 
Set Programming* 



Maarten Marien, David Gilis, and Marc Denecker 

Department of Computer Science, Katholieke Universiteit Leuven, Belgium 
{Maarten.Marien, David.Gilis, Marc.Denecker}@cs.kuleuven.ac.be



Abstract. This paper is an analysis of two knowledge representation 
extensions of logic programming, namely Answer Set Programming and 
ID-Logic. Our aim is to compare both logics on the level of declarative 
reading, practical methodology and formal semantics. At the level of 
methodology, we put forward the thesis that in many (but not all) existing
applications of ASP, an ASP program is used to encode definitions
and assertions, much as in ID-Logic. We illustrate this thesis with an
example and present a formal result that supports it, namely an equiva- 
lence preserving translation from a class of ID-Logic theories into ASP. 
This translation can be exploited also to use the current efficient ASP 
solvers to reason on ID-Logic theories and it has been used to implement 
a model generator for ID-Logic. 



1 Introduction 

This paper is a comparison of Answer Set Programming [9,12], more precisely, 
of General Logic Programming [8] or Stable Logic Programming [11], and ID- 
Logic [2,6]. Both logics can be considered as extensions of logic programming for 
knowledge representation. The basic formal result of this paper is an equivalence 
preserving translation from an important class of ID-Logic theories to ASP. This 
result leads to improved understanding of these logics in different ways. Not only
does it give insight into the formal relationships between the logics, but it also
leads to improved understanding of the methodology of ID-Logic and ASP and
allows us to compare them. Moreover, this result can be exploited to use the current
generation of efficient ASP solvers to reason on or perform problem solving using 
ID-Logic theories. In fact, we discuss an existing model generator for ID-Logic 
which we built using this translation and the Smodels system. 

ID-Logic is an extension of classical first order logic that allows for a uniform 
representation of various forms of definitions, including non-inductive definitions, 
monotone inductive definitions (e.g. the transitive closure of a graph) and non- 
monotone forms of inductive definitions such as iterated induction and induction 
over well-founded posets (e.g. the standard definition of truth of a formula in a 

* Works supported by FWO-Vlaanderen, European Framework 5 Project WASP, and 
by GOA/2003/08. 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 108-120, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004







structure). An ID-Logic theory consists of a set of FOL sentences, called
assertions, and definitions. A definition is represented as a set of definitional rules of
the form $\forall \bar{x}(A \leftarrow \alpha)$ where $A$ is an atom, $\leftarrow$ is the so-called definitional implication
(to be distinguished from material implication, $\supset$) and $\alpha$ is a FOL formula (thus
negation in ID-Logic is classical negation). A definition defines a set of defined
predicates, namely those occurring in the head of rules, in terms of other open
predicates, which appear only in the body of rules. Note that an ID-Logic theory
never contains a bare definitional rule; it may only contain sets of such rules. The
formal semantics is an integration of classical logic semantics and the well-founded
semantics for definitions. Formally, Abductive Logic Programming under
well-founded semantics can be seen as the subformalism of ID-Logic consisting of
theories with only one definition and imposing the Domain Closure Axiom and
the Unique Name Axioms. In [4], an extension called NMID-logic was proposed,
allowing for arbitrary boolean combinations of definitions and FOL formulas.
The same paper explores the use of this logic for knowledge representation in
the context of situation calculus.

The second formalism that we consider here is General Logic Programming
[8], or Stable Logic Programming as it was called in [11,12]. A program in this
formalism consists of general program rules of the form $A \leftarrow Body$ where $A$ is
an atom and $Body$ is a conjunction of literals $B$ or $not\ B$, where $B$ is an atom.
The semantics is the stable model semantics. The formalism is a subformalism
of Answer Set Programming (ASP), without strong negation and without
disjunction in the head. Despite these limitations, most applications of ASP can be
represented in it or in its extension with weight constraints [12]. The formalism
is generally seen as a sublogic of default logic, and negation as failure as a default
negation operator: $not\ A$ reads "it is possible to assume that $A$ is false".

Both logics show considerable differences on the conceptual, syntactic and
semantic level. Yet, if we compare examples and methodologies, striking
similarities show up. To illustrate this, we take a representation of the well-known
notion of Hamiltonian cycles of a graph. (See Figure 1, where we implicitly
assume that the unique names axioms and the domain closure axioms hold.)

There is an apparent similarity on the level of clauses and structure of the
theory. In both theories, four different parts can be distinguished:

— data, representing the graph by a set of atomic clauses in ASP or by two
definitions of Vertex/1 and Edge/2 in ID-Logic.

— ASP rules to open up predicates, here only the predicate in/2. In ASP, this
can be done also using a disjunction (as in dlv) or using a weight constraint
(as in Smodels). Often, such rules also specify a domain for the opened
predicate. The domain of in/2 is the predicate edge/2, which means that
in/2 is a subset of edge/2. As ID-Logic is an extension of classical logic,
predicates are open by default; the domain declaration is formalised by an
implication.

— definitions, here only of the concept of reachable vertices (through
Hamiltonian edges).

— assertions (called constraints in ASP), representing the basic properties of
Hamiltonian cycles.

ID-Logic:

$$\begin{array}{l}
\{\, Vertex(U) \leftarrow \,\} \\
\{\, Edge(U, V) \leftarrow \,\} \\
\{\, InitialVtx(U) \leftarrow \,\} \\[4pt]
\left\{ \begin{array}{l}
\forall x, y\,(Reached(x) \leftarrow In(y,x) \land InitialVtx(y)) \\
\forall x, y\,(Reached(x) \leftarrow In(y,x) \land Reached(y))
\end{array} \right\} \\[4pt]
\forall x, y\,(In(x,y) \supset Edge(x,y)) \\
\forall x, y, z\,((In(x,y) \land In(x,z)) \supset (y = z)) \\
\forall x, y, z\,((In(y,x) \land In(z,x)) \supset (y = z)) \\
\forall x\,(Vertex(x) \supset Reached(x))
\end{array}$$

ASP (taken from [11]):

$$\begin{array}{l}
vertex(u) \\
edge(u, v) \\
initialvtx(u) \\[4pt]
in(V1, V2) \leftarrow edge(V1, V2),\ not\ out(V1, V2) \\
out(V1, V2) \leftarrow edge(V1, V2),\ not\ in(V1, V2) \\[4pt]
reached(V2) \leftarrow in(V1, V2),\ reached(V1) \\
reached(V2) \leftarrow in(V1, V2),\ initialvtx(V1) \\[4pt]
f \leftarrow in(V2, V1),\ in(V3, V1),\ not\ V2 = V3,\ not\ f \\
f \leftarrow in(V1, V2),\ in(V1, V3),\ not\ V2 = V3,\ not\ f \\
f \leftarrow not\ reached(X),\ not\ f
\end{array}$$

Fig. 1. Hamiltonian circuit

Note that negation as failure in the ASP program corresponds to classical
negation in the ID-Logic theory. This shows that in this type of ASP program,
negation as failure is to be interpreted as classical negation.

The role of definitional and assertional knowledge for knowledge representation
has long been recognised in AI [18,1] and was the motivation for Description
logics. The distinction between both sorts of knowledge is a fundamental
one which sheds light on certain aspects of ASP methodology that are otherwise
hard to explain. For example, we need to express that in a Hamiltonian cycle,
all vertices are reachable. In ID-Logic this property is expressed by the FOL
axiom $\forall x(Vertex(x) \supset Reached(x))$. In the ASP program, this is expressed by
the constraint $f \leftarrow not\ reached(X),\ not\ f$. Consider an alternative
representation by the rule $reached(X) \leftarrow vertex(X)$. If we use this representation, we
actually get models in which no edge belongs to the Hamiltonian cycle. How can
we explain this? The reason is that ASP interprets $reached(X) \leftarrow vertex(X)$ as
an additional definitional rule, while in fact it represents assertional knowledge.
The only correct way to represent such knowledge in ASP is by constraints such
as $f \leftarrow not\ reached(X),\ not\ f$.
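
The difference between the two encodings can be observed on a tiny instance. The sketch below is an illustration only: it assumes a hand-grounded graph with vertices a and b, edges a to b and b to a, and initial vertex a, and it enumerates stable models by brute force (reduct plus least model). The atom names and the helpers `least_model` and `stable_models` are hypothetical.

```python
from itertools import combinations


def least_model(definite_rules):
    """Least model of a negation-free program given as (head, positive_body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def stable_models(program):
    """Brute-force stable models of ground rules (head, positive_body, negated_body)."""
    atoms = sorted({h for h, _, _ in program}
                   | {a for _, pos, neg in program for a in (*pos, *neg)})
    found = []
    for size in range(len(atoms) + 1):
        for cand in map(set, combinations(atoms, size)):
            reduct = [(h, pos) for h, pos, neg in program if not cand & set(neg)]
            if least_model(reduct) == cand:
                found.append(cand)
    return found


# Opening up in/2 over the two edges (the edge facts are true and left implicit).
choice = [
    ("in_ab", (), ("out_ab",)), ("out_ab", (), ("in_ab",)),
    ("in_ba", (), ("out_ba",)), ("out_ba", (), ("in_ba",)),
]
# Ground instances of the definition of reached/1 (a is the initial vertex).
reach = [
    ("reached_b", ("in_ab",), ()),
    ("reached_b", ("in_ab", "reached_a"), ()),
    ("reached_a", ("in_ba", "reached_b"), ()),
]
# Assertion as constraints: f <- not reached(X), not f.
constraints = [("f", (), ("reached_a", "f")), ("f", (), ("reached_b", "f"))]
# Alternative encoding as rules: reached(X) <- vertex(X).
as_rules = [("reached_a", (), ()), ("reached_b", (), ())]

with_constraints = stable_models(choice + reach + constraints)
with_rules = stable_models(choice + reach + as_rules)
print(len(with_constraints), len(with_rules))
```

On this instance the constraint encoding leaves a single stable model, containing both edges, whereas the rule encoding admits every choice of edges, including the one that selects no edge at all.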

In our experience, the pattern of four parts consisting of data, declarations 
of open predicates, definitions and assertions, can be found in most applications 
of Stable Logic Programming. Other examples can be found in [13,14], in which 
different LP-approaches to KR are compared. The thesis that we want to put forward
in this paper is that SLP can be interpreted as a language for representing
definitions and assertions, and moreover that this explains most applications and
the methodology that is commonly used in ASP.

Of course, this thesis cannot be formalised or formally proven. However, 
the rest of this paper is concerned with a formal translation from ID-Logic with 
Unique Names Axioms and the Domain Closure Axiom to general logic programs 
which provides strong support for the thesis. The translation sheds light on how 
assertions and definitions are implicitly encoded in ASP. The main problem for 
proving the correctness of this translation is the use of well-founded semantics 
in ID-Logic versus stable semantics in ASP: basically we will show that the non- 
determinism of multiple stable models of many ASP programs derives from the 
combination of open predicates with deterministic definitions on top of them. 

As a final remark, we do not claim that all ASP programs can be understood 
as encodings of ID-Logic theories. Certain ASP programs should be understood 
as autoepistemic theories or default theories, and not as ID-Logic theories. An 
example taken from [7] is the following rule: 

$$check\_status(X) \leftarrow person(X),\ not\ orphan(X),\ not\ \neg orphan(X)$$

The intended declarative reading of this rule is that the status of a person should 
be checked if it is unknown whether it is an orphan or not. In this rule, the 
negation as failure has indeed a non-objective modality; such a modality is not 
available in ID-Logic. What this also shows is that in different applications and 
subsets of ASP, negation as failure and the rule operator have different meanings. 
This ambiguity is investigated in [4]. 

2 Preliminaries 

2.1 Logic Programs 

We first introduce some terminology and basic concepts. A vocabulary $\tau$ is a set
of constant, function, predicate and variable symbols. The Herbrand universe
of $\tau$, consisting of all terms of $\tau$, is denoted $\mathcal{HU}(\tau)$. A Herbrand interpretation
of $\tau$ is a set of ground atoms of $\tau$, containing all atoms that are true. A
3-valued Herbrand interpretation will be defined here as a pair $(I, J)$ of Herbrand
interpretations such that $I \subseteq J$. Intuitively, an atom $A$ is true in $(I, J)$ if it is true
in $I$, it is false in $(I, J)$ if it is false in $J$, and otherwise it is undefined in $(I, J)$.
A pair $(I, J)$ is viewed here as a tuple of an underestimate $I$ and an overestimate $J$.

A general logic program (in $\tau$) is a set of clauses of the form
$A \leftarrow A_1 \land \ldots \land A_i \land not\ A_{i+1} \land \ldots \land not\ A_n$, with $A$ and all $A_k$ atoms (in $\tau$).
We allow infinite programs and infinitary rules (i.e. $i$ and $n$ can be infinite). A definite
logic program $P$ is a general logic program without negative literals. A definite logic
program has a least Herbrand model, denoted $\mathcal{LHM}(P)$.

The grounding of a general logic program $P$ is defined as usual, as the set of
all rules that can be obtained by instantiating variables in rules of $P$ by ground
terms. As is also usual, the grounding is seen as a propositional logic program.

We recall the stable semantics [8] and well-founded semantics [17] of general 
logic programs. As usual, we define these semantics for propositional programs 
only. Semantics of predicate programs are defined through their grounding. 

Given a general logic program $P$ in $\tau$ and a Herbrand interpretation $I$, the
reduct $P_I$ is the program obtained from $P$ by deleting

— each rule that has a negative literal $not\ q_i$ with $q_i \in I$ in the body;

— all negative literals in the body of the remaining rules.

Since $P_I$ is a definite logic program, $\mathcal{LHM}(P_I)$ exists.

We define the Gelfond-Lifschitz operator $GL_P$ associated to program $P$ as
the operator on Herbrand interpretations which maps an interpretation $I$ to
$\mathcal{LHM}(P_I)$.

A stable model of $P$ is defined as a fixpoint of $GL_P$, i.e. as a Herbrand
interpretation $I$ such that $I = \mathcal{LHM}(P_I)$.
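
The reduct, the Gelfond-Lifschitz operator and the fixpoint test can be sketched directly for small propositional programs. This is an illustrative brute-force enumeration, not how ASP solvers work; the rule representation `(head, positive_body, negated_body)` and the function names are assumptions made for the example.

```python
from itertools import combinations


def least_model(definite_rules):
    """Least Herbrand model of a definite program of (head, positive_body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def reduct(program, interp):
    """Drop rules with 'not q' for q in interp, then drop the remaining negative literals."""
    return [(h, pos) for h, pos, neg in program if not interp & set(neg)]


def gl(program, interp):
    """Gelfond-Lifschitz operator: least model of the reduct."""
    return least_model(reduct(program, interp))


def stable_models(program, atoms):
    """All subsets I of atoms with gl(program, I) == I."""
    return [set(c) for k in range(len(atoms) + 1)
            for c in combinations(atoms, k) if gl(program, set(c)) == set(c)]


# p <- not q.   q <- not p.   r <- p.
program = [("p", (), ("q",)), ("q", (), ("p",)), ("r", ("p",), ())]
print(stable_models(program, ["p", "q", "r"]))
```

The example program has exactly two stable models, one containing q only and one containing p and r.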

To define the notion of well-founded model of $P$, we follow the approach of
[3]. The operator $GL_P$ is antimonotone, i.e. if $I \subseteq J$ then $GL_P(I) \supseteq GL_P(J)$.
As a consequence, $GL_P^2$ is a monotone lattice operator and has a least fixpoint
$lfp(GL_P^2)$ and a largest fixpoint $gfp(GL_P^2)$. The pair $(lfp(GL_P^2), gfp(GL_P^2))$ is
the well-founded model of $P$. It holds that for each stable model $I$,
$lfp(GL_P^2) \subseteq I \subseteq gfp(GL_P^2)$.
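
For a finite propositional program, both fixpoints can be reached by plain iteration of the squared operator, from the empty set upwards and from the set of all atoms downwards. The following sketch assumes the same hypothetical rule representation `(head, positive_body, negated_body)`; `well_founded` is an illustrative name.

```python
def least_model(definite_rules):
    """Least Herbrand model of a definite program of (head, positive_body) pairs."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def gl(program, interp):
    """Gelfond-Lifschitz operator: least model of the reduct of program w.r.t. interp."""
    return least_model([(h, pos) for h, pos, neg in program if not interp & set(neg)])


def well_founded(program):
    """Return (lfp, gfp) of the squared operator, i.e. the (under, over) estimates."""
    atoms = {h for h, _, _ in program} | {a for _, pos, neg in program for a in (*pos, *neg)}

    def gl2(interp):
        return gl(program, gl(program, interp))

    lower, upper = set(), set(atoms)
    while gl2(lower) != lower:   # ascending iteration from the bottom element
        lower = gl2(lower)
    while gl2(upper) != upper:   # descending iteration from the top element
        upper = gl2(upper)
    return lower, upper


# p <- not q.   q <- not p.   r <- not s.   (s has no rules)
program = [("p", (), ("q",)), ("q", (), ("p",)), ("r", (), ("s",))]
lower, upper = well_founded(program)
print(sorted(lower), sorted(upper))  # ['r'] ['p', 'q', 'r']
```

Here r is true, s is false, and p and q are undefined; the two stable models of the program, which contain r together with either p or q, lie between the two estimates.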

2.2 ID-Logic 

As mentioned in the introduction, an ID-Logic theory $T$ in $\tau$ is a set of FOL
sentences and definitions. Each definition $D$ is a set of rules of the form
$\forall \bar{x}(A \leftarrow \alpha)$ where $A$ is an atom and $\alpha$ a FOL formula. Each definition $D$ has a set
$Def(D)$ of defined predicates, i.e., those appearing in the head of a rule. (The
defined predicates may explicitly be mentioned in front of the rule set, as in
$P/n, Q/m ::= \{\ldots\}$, which simultaneously defines the predicate $P$ with arity $n$
and the predicate $Q$ with arity $m$. The empty definition of a predicate $P/n$ can
be represented by $P/n ::= \{\}$.) The set $\tau \setminus Def(D)$ is called the set of open
symbols of $D$ and is denoted $Open(D)$.

The semantics of ID-Logic is based on an extension of the well-founded
semantics to arbitrary (non-Herbrand) interpretations. This extension associates
with each definition $D$ and an arbitrary interpretation $I_o$ of $Open(D)$ a unique
(possibly 3-valued) well-founded model $I_o^D$, called the well-founded model of
$D$ extending $I_o$. A $\tau$-interpretation $I$ is a model of $D$ if $I$ is two-valued and
$I = (I|_{Open(D)})^D$. Here, $I|_{Open(D)}$ denotes the restriction of $I$ to the open
symbols of $D$. Formally, an interpretation $I$ is a model of an ID-Logic theory iff it
is a model of each of its FOL sentences and a model of each of its definitions $D$.
For details we refer to [2,6].

Example 1. Consider the ID-Logic definition $\{P \leftarrow Q\}$. There are two
interpretations of the open predicate $Q$ of this definition: one where $Q = \mathbf{t}$, one where
$Q = \mathbf{f}$. The corresponding well-founded models of the definition are resp. $\{P, Q\}$
and $\emptyset$. By a symmetric argument, these interpretations are also the two models
of the definition $\{Q \leftarrow P\}$. Consequently, the theory $T_1 = [\{P \leftarrow Q\}, \{Q \leftarrow P\}]$
has models $\{P, Q\}$ and $\emptyset$.

On the other hand, the theory $T_2 = [\{P \leftarrow Q, Q \leftarrow P\}]$ has only one model:
$\emptyset$. The definition has no open predicates and its well-founded model is $\emptyset$.
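
Example 1 can be replayed with a small model checker for propositional theories that consist of definitions only. The sketch is an illustration under stated assumptions: interpretations are sets of true atoms, rules are `(head, positive_body, negated_body)` triples, and the well-founded model of a definition extending an interpretation of its open atoms is computed by adding the true open atoms as facts. The names `is_model` and `models` are hypothetical.

```python
from itertools import combinations


def least_model(definite_rules):
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def gl(program, interp):
    return least_model([(h, pos) for h, pos, neg in program if not interp & set(neg)])


def well_founded(program):
    atoms = {h for h, _, _ in program} | {a for _, pos, neg in program for a in (*pos, *neg)}
    gl2 = lambda interp: gl(program, gl(program, interp))
    lower, upper = set(), set(atoms)
    while gl2(lower) != lower:
        lower = gl2(lower)
    while gl2(upper) != upper:
        upper = gl2(upper)
    return lower, upper


def is_model(definition, interp):
    """interp is a model of the definition if the well-founded model extending
    its open part is two-valued and coincides with interp."""
    defined = {h for h, _, _ in definition}
    open_facts = [(a, (), ()) for a in interp - defined]
    lower, upper = well_founded(list(definition) + open_facts)
    return lower == upper == interp


def models(theory, vocabulary):
    subsets = (set(c) for k in range(len(vocabulary) + 1)
               for c in combinations(sorted(vocabulary), k))
    return [i for i in subsets if all(is_model(d, i) for d in theory)]


T1 = [[("P", ("Q",), ())], [("Q", ("P",), ())]]   # two separate definitions
T2 = [[("P", ("Q",), ()), ("Q", ("P",), ())]]     # one merged definition
print(models(T1, {"P", "Q"}))
print(models(T2, {"P", "Q"}))
```

The first theory has the empty interpretation and the interpretation making both P and Q true as models; the second has only the empty one, in line with the example.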

From a knowledge representation perspective, the use of Herbrand interpre- 
tations boils down to the use of the Domain Closure Axiom and the Unique 
Name Axioms. Those are not imposed by FOL nor by ID-Logic but they can be 
expressed in ID-Logic [6]. However, in the context of this paper we will only con- 
sider ID-Logic theories which (implicitly) contain these axioms. So, all models 
are Herbrand models (modulo isomorphism). 

A crucial notion is that of totality: a definition $D$ is total in an interpretation
$I_o$ of $Open(D)$ iff the well-founded model $I_o^D$ of $D$ extending $I_o$ is 2-valued. A
definition $D$ is total in a theory $T_o$ in $Open(D)$ if $D$ is total in each model of $T_o$.
A definition $D$ is total if it is total in the empty theory, that is if $D$ is total in
each interpretation $I_o$ of $Open(D)$. Intuitively, a definition $D$ is total in $I_o$ if the
definition allows us to determine the truth values of all the defined atoms in the
context of $I_o$.

3 The Transformation 

We first present a formal transformation for a restricted subclass of ID-Logic 
theories which comprises the example of Section 1. In the next subsection, we 
extend the transformation to more general cases. 

We will use the following notion of equivalence. Let $\tau_1, \tau_2$ be vocabularies
extending $\tau$, and $T_1$ and $T_2$ theories in respectively $\tau_1, \tau_2$; then $T_1$ and $T_2$ are
equivalent in $\tau$ (denoted $T_1 \equiv_\tau T_2$) if for each $\tau_1$-model $M_1$ of $T_1$, there exists a
$\tau_2$-model $M_2$ of $T_2$ such that $M_1|_\tau = M_2|_\tau$, and vice versa. The theories $T_1$ and
$T_2$ do not necessarily belong to the same logic, e.g., $T_1$ might be an ID-Logic
theory and $T_2$ a stable logic program.

3.1 A First Transformation 

The ID-Logic theories $T$ considered in this section have a structure similar
to that of the example of Section 1. They consist of the following components:

— definitions $D_{P/m}$ to represent data, defining certain predicates by exhaustive
enumeration. Such definitions consist of ground atomic rules;

— one domain declaration $\forall \bar{x}(P(\bar{x}) \supset C_P(\bar{x}))$ for each predicate $P/n$ that is open in
all definitions of $T$; $C_P$ is a conjunction of literals and will be called the
domain of $P/n$. An example is the FOL axiom $\forall x, y(In(x,y) \supset Edge(x,y))$;

— a set of definitions $D_1, \ldots, D_n$ defining other concepts; the bodies of all rules
are conjunctions of literals;

— a set of FOL formulas in the clausal form $\forall(A_1 \land \ldots \land A_n \supset B_1 \lor \ldots \lor B_m)$,
where $A_i$, $B_j$ are atoms.

In addition, the definitions of $T$ should satisfy some other conditions. To express
these conditions, we need the following concept. The dependency relation $P/n <
Q/m$ of $T$ is the least transitive relation between predicate symbols containing
all pairs $(P/n, Q/m)$ such that $Q/m$ appears in the head and $P/n$ in the body
of some rule of some definition $D \in T$. We call an ID-Logic theory $T$ stratified
if each predicate is defined in at most one definition of $T$ and any two predicates
$P/n$ and $Q/m$ are defined in the same definition of $T$ whenever $P/n < Q/m$ and
$Q/m < P/n$. Given this concept, $T$ should also satisfy the following conditions:

— $T$ is a stratified ID-Logic theory;

— each definition $D \in T$ is total.
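
The stratification condition is purely syntactic and can be checked from the predicate names alone. The sketch below is a minimal illustration, assuming that each rule is abstracted to a pair of its head predicate and the list of predicates in its body; `dependency` and `is_stratified` are hypothetical names.

```python
def dependency(theory):
    """Least transitive relation containing (p, q) whenever q is the head
    and p occurs in the body of some rule of some definition."""
    dep = {(p, head) for definition in theory for head, body in definition for p in body}
    while True:
        extra = {(p, r) for p, q in dep for q2, r in dep if q == q2} - dep
        if not extra:
            return dep
        dep |= extra


def is_stratified(theory):
    defined_in = {}
    for idx, definition in enumerate(theory):
        for head, _ in definition:
            if defined_in.setdefault(head, idx) != idx:
                return False  # the predicate is defined in more than one definition
    dep = dependency(theory)
    # Mutually dependent predicates must be defined in the same definition.
    return all(defined_in[p] == defined_in[q] for p, q in dep if (q, p) in dep)


T1 = [[("P", ["Q"])], [("Q", ["P"])]]             # P and Q defined separately
T2 = [[("P", ["Q"]), ("Q", ["P"])]]               # P and Q defined together
T3 = [[("Person", ["Man", "Woman"])],
      [("Person", ["Child", "Adult"])]]           # Person defined twice
print(is_stratified(T1), is_stratified(T2), is_stratified(T3))  # False True False
```

Totality, the second condition, is a semantic property and is not covered by this check.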

A theory $T$ satisfying the above conditions can be easily transformed into a
stable logic program $P_T$. We describe this transformation in two steps:

— The first step transforms the ID-Logic theory $T$ into an ID-Logic theory
$T' = T_a \cup \{D_T\}$, where $T_a$ consists of all FOL axioms of $T$ and $D_T = \bigcup_{D \in T} D$,
with $Def(D_T) = \bigcup_{D \in T} Def(D)$. So, all definitions are merged. Notice that
this theory is formally an abductive logic program under the well-founded
semantics (where the abducible predicates correspond to $Open(D_T)$).

— In the second step we replace $\neg$ by $not$, and $\land$ by ',' in the definition $D_T$.
Also, we switch the case of constant, functor and variable symbols. Using the
method of Satoh and Iwayama [15] to transform an abductive logic program
into a stable logic program, we then interpret $D_T$ as a set of ASP rules, and
add to this set the rules

$$P(\bar{X}) \leftarrow C_P(\bar{X}),\ not\ P^*(\bar{X}),$$

$$P^*(\bar{X}) \leftarrow C_P(\bar{X}),\ not\ P(\bar{X})$$

for each predicate $P/n \in Open(D_T)$. Finally, for each clause
$\forall(A_1 \land \ldots \land A_n \supset B_1 \lor \ldots \lor B_m) \in T_a$, we add one rule to the set:

$$f \leftarrow A_1, \ldots, A_n,\ not\ B_1, \ldots,\ not\ B_m,\ not\ f.$$
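
The second step can be tried out on a small propositional theory. The sketch below is an illustration under simplifying assumptions: atoms are propositional, the domain $C_P$ of every open atom is taken to be trivially true, and stable models are enumerated by brute force. The theory used, the suffix `_star` and the helper names are hypothetical.

```python
from itertools import combinations


def least_model(definite_rules):
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def stable_models(program):
    atoms = sorted({h for h, _, _ in program}
                   | {a for _, pos, neg in program for a in (*pos, *neg)})
    found = []
    for size in range(len(atoms) + 1):
        for cand in map(set, combinations(atoms, size)):
            reduct = [(h, pos) for h, pos, neg in program if not cand & set(neg)]
            if least_model(reduct) == cand:
                found.append(cand)
    return found


def translate(definition, open_atoms, clauses):
    """Merged definition + choice rules for open atoms + one constraint per clause."""
    program = list(definition)
    for p in open_atoms:
        program += [(p, (), (p + "_star",)), (p + "_star", (), (p,))]
    for antecedent, consequent in clauses:
        # A1 and ... and An implies B1 or ... or Bm  becomes  f <- A1..An, not B1..Bm, not f
        program.append(("f", tuple(antecedent), tuple(consequent) + ("f",)))
    return program


definition = [("P", ("Q",), ("R",))]   # {P <- Q and not R}, with Q and R open
clauses = [((), ("Q", "R"))]           # the assertion Q or R
program = translate(definition, ["Q", "R"], clauses)
visible = sorted(sorted(m & {"P", "Q", "R"}) for m in stable_models(program))
print(visible)  # [['P', 'Q'], ['Q', 'R'], ['R']]
```

Restricted to the original vocabulary, the three stable models are the three interpretations that satisfy the assertion and in which P holds exactly when Q holds and R does not, which is what one expects of the models of the source theory.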



Theorem 1. A Herbrand interpretation $M$ is a model of $T$ if and only if there
is a stable model $M'$ of $P_T$ such that $M = M'|_\tau$.







Proof (sketch). The theory $T$ satisfies the conditions of the modularity theorem
of [5]. As a consequence, the ID-Logic theories $T$ and $T'$ are equivalent and the
definition $D_T$ is total. The ID-Logic theory $T'$ is formally an abductive logic
program under the well-founded semantics. Moreover, because of the totality of
$D_T$, all well-founded models are 2-valued, and hence the well-founded and stable
semantics coincide. The correctness of the last step of the transformation (i.e.
from an abductive logic program under the stable semantics to a stable logic
program) was proven in [15].

Example 2. The ID-Logic theory of Section 1 satisfies the conditions specified
in this section. The only non-trivial condition is the totality of the definition
of $Reached$. This definition is a monotone definition (no negative literals
with defined predicates in the body of rules) and such definitions are
total [5]. The only difference between the translation of this theory and the
ASP theory from Fig. 1 is that our translation uses the constraint
"$f \leftarrow Vertex(x),\ not\ Reached(x),\ not\ f$", while the original ASP theory uses
"$f \leftarrow not\ reached(X),\ not\ f$" instead.

3.2 Extending the Transformation 

This section presents an extension of the transformation to a broader class of 
ID-Logic theories, by providing three separately applicable transformations to 
theories from the class considered in Section 3.1: 

— a transformation from non-stratifiable to stratifiable ID-Logic theories; 

— a transformation from arbitrary FOL axioms to clausal axioms; 

— a transformation from rules with FOL formulas in the body to conjunctive 
rules. 



Non-stratifiable theories. The previous section imposed that no predicate
should be defined in more than one definition, and if predicates $P/n$ and $Q/m$
depend on each other, then they are defined in the same definition. If these
conditions are not satisfied, then merging together all definitions into one is not
equivalence preserving. Example 1 already illustrated this for a non-stratified
theory (the theories $[\{P \leftarrow Q\}, \{Q \leftarrow P\}]$ and $[\{P \leftarrow Q, Q \leftarrow P\}]$ are not
equivalent); the next example shows the problem with multiple definitions for the same
concept.

Example 3. Consider the ID-Logic theory

$$\{\forall x(Person(x) \leftarrow Man(x) \lor Woman(x))\},$$

$$\{\forall x(Person(x) \leftarrow Child(x) \lor Adult(x))\}$$

This theory contains two definitions for the predicate $Person$. These definitions
constrain each other; for example, the theory logically entails the formula
$\forall x(Man(x) \lor Woman(x) \equiv Child(x) \lor Adult(x))$. If the definitions are merged, this formula
is no longer entailed.







The problems shown are easy to avoid by renaming the defined predicates
before merging. We create a new ID-Logic theory $T^{(1)}$ consisting of the following
parts:

— for each definition $D \in T$, a definition $D'$ obtained from $D$ by replacing any
occurrence of $P \in Def(D)$ by $P^D$, where $P^D$ is a new predicate;

— all assertions of $T$;

— formulas $\forall \bar{x}(P(\bar{x}) \equiv P^D(\bar{x}))$ for each definition $D$ and predicate $P$ defined
in $D$.



Theorem 2. The theory $T^{(1)}$ is a stratified theory. It holds that $T \equiv_\tau T^{(1)}$
(where $\tau$ is $T$'s vocabulary). Moreover, if each definition of $T$ is total, then each
definition of $T^{(1)}$ is total.



Example 4. The theory $T_1$ from Example 1 is transformed by this first step into

$$\{P^{D_1} \leftarrow Q\}, \quad \{Q^{D_2} \leftarrow P\}, \quad P^{D_1} \equiv P, \quad Q^{D_2} \equiv Q$$



Transforming FOL formulas in clausal form. The standard transformation
of FOL formulas into clausal form cannot be used in this context. The reason
is that we assume the Unique Names Axioms, while the transformation to CNF
introduces Skolem function symbols to which the Unique Names Axioms do not
apply. Instead, the following variant transformation can be used. Each FOL
formula can be brought into the form:

$$\forall (F_1 \lor \ldots \lor F_n \lor G_1 \lor \ldots \lor G_m)$$

where each $F_j$ is a literal and each $G_i$ is an existentially quantified formula
$\exists \bar{x}\, H_i$.

Let $\bar{x}_i$ be all free variables of $G_i$. We introduce for each $G_i$, $1 \le i \le m$, a
new predicate $P_i/n_i$, where $n_i$ is the number of variables in $\bar{x}_i$, and translate
the above FOL formula into:

$$\forall (F_1 \lor \ldots \lor F_n \lor P_1(\bar{x}_1) \lor \ldots \lor P_m(\bar{x}_m))$$

combined with definitions

$$\{\forall \bar{x}_i\,(P_i(\bar{x}_i) \leftarrow G_i)\}$$



Let T be an arbitrary ID-Logic theory, and $T^{(2)}$ be the ID-Logic theory
obtained by applying the above transformation.

Theorem 3. It holds that $T \equiv_\tau T^{(2)}$, where $\tau$ is the vocabulary of T (without
the new symbols $P_i/n_i$).







Creating conjunctive bodies. We now discuss how to transform definitional
rules with FOL bodies to rules with conjunctions of literals in the body. The
transformation is basically the one proposed by Lloyd and Topor [10], with the
exception of the rule for removing universal quantifiers in bodies. Rules of the
form

$$\forall (H \leftarrow F \land (\forall \bar{x}\, V(\bar{x})) \land G)$$

are transformed by Lloyd and Topor into a pair of rules

$$\forall (H \leftarrow F \land \lnot P'(y_1, \ldots, y_m) \land G)$$

$$\forall (P'(y_1, \ldots, y_m) \leftarrow \lnot V(\bar{x}))$$

where $y_1, \ldots, y_m$ are the free variables of $\forall \bar{x}\, V(\bar{x})$ and $P'$ is a new predicate.
However, if a predicate depending on H occurs in $V(\bar{x})$, this transformation is
not equivalence preserving in general. Instead, we replace this rule with

$$\forall (H \leftarrow F \land V(C_1) \land V(C_2) \land \ldots \land G)$$

where $C_1, C_2, \ldots$ are all terms in the Herbrand universe. Of course, this
transformation rule may produce infinitary rules in case the Herbrand universe is
infinite.

The set of all rewrite rules is presented in Figure 2, where F and G denote
arbitrary FOL formulae.

By applying the above rewrite rules on an arbitrary ID-Logic-definition D 
until none is applicable anymore, we obtain a definition D' such that all bodies 
are conjunctions of literals. The following theorem holds. 

Theorem 4. The definitions D and D' are logically equivalent. Moreover, if D 
is total then so is D' . 

At this point, we can transform any ID-Logic theory T into an equivalent
ID-Logic theory $T^{(3)}$ which is stratified and such that all bodies of definitional
rules are conjunctions of literals. More precisely, it holds that $T \equiv_\tau T^{(3)}$.
Moreover, if all definitions of T are total, then all definitions of $T^{(3)}$ are total.
Consequently, the conditions of theorem 1 hold and $T^{(3)}$ can be transformed into
an equivalent stable logic program $P_{T^{(3)}}$. We find that $T \equiv_\tau P_{T^{(3)}}$.

4 A Model Generator for ID-Logic 

The above transformation, apart from showing the commonalities between the
methodologies of ASP and ID-Logic, also provides us with an effective means to
compute models for a subclass of ID-Logic theories with total definitions: we first
translate them, and then apply a stable model generator, such as lparse/Smodels,
to the translation.

Since the general transformation may produce infinitary rules, and/or rules
which have to be grounded w.r.t. the Herbrand universe, we further restrict the
class of ID-Logic theories.







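replace  $\forall \bar{x}\,(H \leftarrow F \land (\exists y_1 \ldots y_m\, V) \land G)$
by       $\forall \bar{x}, y_1, \ldots, y_m\,(H \leftarrow F \land V \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land \lnot(\exists y_1 \ldots y_m\, V) \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land \forall y_1 \ldots y_m\, \lnot V \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land \lnot(\forall y_1 \ldots y_m\, V) \land G)$
by       $\forall \bar{x}, y_1, \ldots, y_m\,(H \leftarrow F \land \lnot V \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land (V \equiv W) \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land ((V \land W) \lor (\lnot V \land \lnot W)) \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land (V \subset W) \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land V \land G)$
         $\forall \bar{x}\,(H \leftarrow F \land \lnot W \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land \lnot(V \subset W) \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land W \land \lnot V \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land (V \lor W) \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land V \land G)$
         $\forall \bar{x}\,(H \leftarrow F \land W \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land \lnot(V \lor W) \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land \lnot V \land \lnot W \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land \lnot(V \land W) \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land \lnot V \land G)$
         $\forall \bar{x}\,(H \leftarrow F \land \lnot W \land G)$

replace  $\forall \bar{x}\,(H \leftarrow F \land \lnot\lnot V \land G)$
by       $\forall \bar{x}\,(H \leftarrow F \land V \land G)$

Fig. 2. Lloyd-Topor transformations for obtaining conjunctive bodies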



Definition 1 (Restricted definitions). We define that a definition D is
restricted in an ID-Logic theory T by the following inductive rules:

— D is a definition by exhaustive enumeration (of ground facts), or

— D satisfies the following conditions:

• all bodies are conjunctions of literals, and each predicate of the body is
defined in a restricted definition in T,

• each variable occurring in the head of a rule also occurs in a positive
literal of the body of this rule, and

• there is no other definition in T defining predicates defined in D.

We denote the predicates of an ID-Logic theory T defined in a restricted 
definition with Restricted(T) . If the definitions of restricted predicates are finite, 
the extension of each restricted predicate is finite and can be computed easily. 
Consequently, each conjunction R of atoms from Restricted(T) has a finite and 
computable extension. 

To avoid the infinitary rules which are produced in the presence of recursion
over $\forall$, we restrict the use of $\forall$ to formulas of the form
“$\forall \bar{x}\,(R(\bar{x}) \supset F(\bar{x}))$”, where R is a conjunction of restricted
atoms, and all free variables of $F(\bar{x})$ occur in $R(\bar{x})$. A rule
“$\forall \bar{y}\,(H \leftarrow F \land \forall \bar{x}\,(R(\bar{x}) \supset F(\bar{x})) \land G)$” can be
transformed into “$\forall \bar{y}\,(H \leftarrow F \land F(C_1) \land \ldots \land F(C_n) \land G)$”,
where $C_1, \ldots, C_n$ are all elements of the extension of R. (Of course, we only
do so if $F(\bar{x})$ contains a predicate depending on H; otherwise we can just use
Lloyd-Topor’s transformation.)
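
The expansion of a restricted universal quantifier into a finite conjunction can be illustrated as follows. This is a minimal sketch under stated assumptions: rules are rendered as plain strings, the extension of R is given as a finite set of tuples, and the names `expand_universal` and `ground_rule` are hypothetical helpers introduced only for this example.

```python
def expand_universal(extension_of_r, instantiate):
    """Replace 'forall x (R(x) -> F(x))' by the instances F(C_1), ..., F(C_n),
    one for each tuple C_i in the (finite) extension of R."""
    return [instantiate(c) for c in sorted(extension_of_r)]


def ground_rule(head, before, extension_of_r, instantiate, after):
    """Build the body 'before, F(C_1), ..., F(C_n), after' for the given head."""
    body = list(before) + expand_universal(extension_of_r, instantiate) + list(after)
    if not body:
        return f"{head}."
    return f"{head} :- {', '.join(body)}."


# Hypothetical rule: all_ok <- forall x (node(x) -> ok(x)),
# where the restricted predicate node/1 has the extension {a, b, c}.
nodes = {("a",), ("b",), ("c",)}
rule = ground_rule("all_ok", [], nodes, lambda c: f"ok({c[0]})", [])
```

Because the extension of a conjunction of restricted atoms is finite, the resulting body is finite as well, which is the point of the restriction.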







Definition 2 (Strongly range-restricted ID-Logic theory). An ID-Logic
theory T is strongly range-restricted if the following conditions hold (below,
$R(\bar{x})$ denotes a conjunction of restricted atoms):

— any definition in it is total;

— all quantifiers in assertions and bodies of rules are restricted to formulas of
the form $\forall \bar{x}\,(R(\bar{x}) \supset F(\bar{x}))$ and $\exists \bar{x}\,(R(\bar{x}) \land F(\bar{x}))$;

— for each open predicate P there is one domain declaration
$\forall \bar{x}\,(P(\bar{x}) \supset R(\bar{x}))$;

— each definitional rule is of the form $\forall \bar{x}\,(P(\bar{x}) \leftarrow R(\bar{x}) \land F)$.

Theorem 5. A strongly range-restricted ID-Logic theory is transformed by our 
transformation to a strongly range-restricted [16] general logic program. 

Note that lparse/Smodels requires strongly range-restricted programs.
Therefore, by requiring our ID-Logic theories to be strongly range-restricted, we
can calculate ID-Logic models using lparse/Smodels.

We have devised an implementation of our transformation which, when applied
to theories from this class and combined with lparse/Smodels, computes
ID-Logic models.

5 Conclusions 

We presented a general transformation from ID-Logic theories to stable logic
programs. This transformation illustrates the fundamental distinction between 
definitional and assertional knowledge and shows how these can be encoded 
in Stable Logic Programming. We believe our transformation truthfully corre- 
sponds to the way many ASP programs in many applications are developed. 
This way, the transformation sheds light on the methodologies of ID-Logic and 
ASP. 

Also, the transformation has enabled us to create an ID-Logic model genera- 
tor, by applying our transformation on an ID-Logic theory, and using an existing 
ASP model generator to find the models of the resulting program. 

References 

1. R. J. Brachman and H.J. Levesque. Competence in Knowledge Representation. In 
Proc. of the National Conference on Artificial Intelligence , pages 189-192, 1982. 

2. M. Denecker. Extending classical logic with inductive definitions. In J. Lloyd et al., 
editor, First International Conference on Computational Logic (CL2000), volume 
1861 of Lecture Notes in Artificial Intelligence, Springer, pages 703-717, 2000. 

3. M. Denecker, V.W. Marek and M. Truszczynski. Approximating operators, stable 
operators, well-founded fixpoints and applications in non-monotonic reasoning. In 
Logic-based Artificial Intelligence, (J. Minker ed.), Kluwer Academic Publishers, 
pages 127-144, 2000. 

4. M. Denecker. What’s in a model? Epistemological analysis of Logic Programming. 
In Proceedings of Ninth International Conference on Principles of Knowledge Rep- 
resentation and Reasoning, Delta Whistler Resort, Canada, 2004. 







5. M. Denecker and E. Ternovska. A logic of non-monotone inductive definitions and
its modularity properties. In Logic Programming and Nonmonotonic Reasoning:
7th International Conference (V. Lifschitz and I. Niemela, eds.), vol 2923, Lecture
Notes in Computer Science, pages 47-60, 2004.

6. M. Denecker and E. Ternovska. Inductive Situation Calculus. In Proceedings of 
Ninth International Conference on Principles of Knowledge Representation and 
Reasoning, Delta Whistler Resort, Canada, 2004. 

7. M. Gelfond. Representing knowledge in A-Prolog. In A. Kakas and F. Sadri, 
editors, Computational Logic: Logic Programming and Beyond; Essays in honour 
of Robert A. Kowalski, Part II, number 2407 in Lecture Notes in Computer Science, 
pages 413-451. Springer Verlag, 2002. 

8. M. Gelfond and V. Lifschitz. Logic Programs with Classical Negation. In D.H.D. 
Warren and P. Szeredi, editors, Proc. of the 7th International Conference on Logic 
Programming 90, page 579. MIT Press, 1990. 

9. M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive 
databases. New Generation Computing, pages 365-387, 1991. 

10. J. W. Lloyd and R. W. Topor. Making Prolog more Expressive. Journal of Logic 
Programming, 3:225-240, 1984. 

11. V.W. Marek and M. Truszczynski. Stable models and an alternative logic
programming paradigm. In K.R. Apt, V. Marek, M. Truszczynski, and D.S. Warren,
editors, The Logic Programming Paradigm: a 25 Years Perspective, pages 375-398.
Springer-Verlag, 1999.

12. I. Niemela. Logic programs with stable model semantics as a constraint
programming paradigm. Annals of Mathematics and Artificial Intelligence,
25(3,4):241-273, 1999.

13. N. Pelov, E. De Mot, and M. Denecker. Logic programming approaches for
representing and solving constraint satisfaction problems: a comparison. Proceedings
of the 7th International Conference on Logic for Programming and Automated
Reasoning, (M. Parigot and A. Voronkov, eds.), vol 1955, Lecture Notes in Artificial
Intelligence, pages 225-239, 2000.

14. N. Pelov, E. De Mot, and M. Bruynooghe. A comparison of logic programming
approaches for representation and solving of constraint satisfaction problems.
Proceedings of the 8th International Workshop on Nonmonotonic Reasoning, (C. Baral
and M. Truszczynski, eds.), pages 1-10, 2000.

15. K. Satoh and N. Iwayama. Computing Abduction by Using the TMS. In
Proceedings of ICLP’91, pages 505-518, 1991.

16. T. Syrjanen. Implementation of Local Grounding for Logic Programs with Stable
Model Semantics. Helsinki University of Technology, Technical Report, 1998.

17. Allen Van Gelder, Kenneth A. Ross, John S. Schlipf. The Well-Founded Semantics 
for General Logic Programs. Journal of the ACM 38(3):620-650, 1991. 

18. W. Woods. What’s in a Link: Foundations for Semantic Networks. In D. Bobrow 
and A. Collins, editors, Representation and understanding: Studies in cognitive sci- 
ence. Academic Press, New York, 1975. Also in Brachman and Levesque, Readings 
in Knowledge Representation, Morgan Kaufman, 1985. 




An Implementation of Statistical Default Logic 



Gregory R. Wheeler and Carlos Damasio 



Centro de Inteligencia Artificial (CENTRIA) 
Departamento de Informatica, Universidade Nova de Lisboa 
2829-516 Caparica, Portugal 
{greg, cd}@di.fct.unl.pt



Abstract. Statistical Default Logic (SDL) is an expansion of classical 
(i.e., Reiter) default logic that allows us to model common inference 
patterns found in standard inferential statistics, e.g., hypothesis testing 
and the estimation of a population's mean, variance and proportions. 
This paper presents an embedding of an important subset of SDL theo- 
ries, called literal statistical default theories, into stable model semantics. 
The embedding is designed to compute the signature set of literals that 
uniquely distinguishes each extension on a statistical default theory at a 
pre-assigned error-bound probability. 



1 Introduction 

Standard statistical inference is non-monotonic. Parameters of a target popula- 
tion may be estimated by measures taken on a sample that, after testing for bias, 
serve as a defeasible estimate of the population’s corresponding parameters. For 
example, we may estimate the age of a population by identifying the mean age 
of a representative sample drawn from the population. However, classifying a 
sample as representative is not straightforward since knowing that a sample is 
representative is to be in the position of not needing to use inferential statistics. 

The fit between a statistic and a target parameter is defeasible because a 
sample, however carefully selected, may fail to be representative of the target 
population. Consider the estimation of a population’s mean age. Textbooks ad- 
vise that drawing a sample at random is a good procedure for selecting repre- 
sentative samples [2], [12], [8]. But of course drawing a sample at random does 
not guarantee that it is representative. Suppose a random sample selects only 
subscribers to Rolling Stone, a magazine covering popular culture catering to 
young adults. Suppose also that the population whose age we are interested in 
estimating is of a particular medium-sized city. Our background knowledge con- 
cerning the constitution of cities would make us suspect that the sample we’ve 
drawn does not give us a close estimate of the city’s mean age even though the 
sample was drawn at random. 

In [7] it was shown that key assumptions employed in standard inferential 
statistical practice, such as the random sampling assumption, actually function 
like default justifications. In [16] an expanded default logic, called statistical de- 
fault logic, was introduced to capture the defeasible structure of basic statistical 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 121-133, 2004. 
(c) Springer- Verlag Berlin Heidelberg 2004 







inference. The resulting logic provides a knowledge representation framework 
for representing standard statistical argument forms and sequences composed of 
statistical and deductive inference steps. 1 

In this paper we present an embedding of an important fragment of statistical 
logic into answer-set programming. The structure of the paper is as follows. First 
we will present a brief motivation for statistical default logic from a knowledge 
representation point of view, highlighting the structural similarity between a 
standard statistical inference and statistical default inference forms. Next we will 
present an example of a statistical default extension. (Refer to [16] for details.) 
We then present an embedding of a fragment of statistical default logic into 
answer-set programming. This embedding faithfully captures the central and 
new notion in statistical default logic, namely that of terminating admissible 
inference sequences at a specified threshold level. Finally, we highlight the novelty 
of these results by comparing them to existing probabilistic logic programming 
frameworks. 



2 Representing Statistical Inference Within Statistical 
Default Logic 

We assume here familiarity with classical default logic [15]. Statistical default
logic [16] extends classical default logic by associating with each element in a
default theory, both formulae from a propositional language and defaults, a real
number $0 \le \epsilon \le 1$ called an error-bound parameter.

A statistical default is an inference form that explicitly acknowledges the
upper limit of the probability of applying that default rule and accepting a false
statement. 2

Definition 1. A statistical default is an ordered pair consisting of a classical
propositional default in the first coordinate and an error-bound parameter $\epsilon$ in
the second coordinate, displayed as

$$\frac{\alpha : \beta_1, \ldots, \beta_n}{\gamma}\ \epsilon \qquad (1)$$

Expression (1) is called an $\epsilon$-bounded statistical default (s-default, for short),
where $\epsilon$ expresses the upper limit on the probability of applying (1) and accepting
that $\gamma$ is true when $\gamma$ is false. We say that the error parameter $\epsilon$ is an
$\epsilon$-bound for the s-default displayed in expression (1).

The logic also replaces sentences in the propositional language with
sentence-$\epsilon$ pairs, called bounded sentences.

1 Representing statistical argument forms by defaults is distinct from [1], which studied
the representation of statistical statements rather than statistical inference.

2 A trivial corollary of the probability of error $\alpha$ for a statistical inference is the upper
limit of the probability of error, denoted by $\epsilon$. So, if $\alpha = 0.03$ is understood to mean
that the probability of committing a Type I error is 0.03, then $\epsilon = 0.03$ is understood
to mean that the probability of committing a Type I error is no more than 0.03.







Definition 2. Bounded sentence: A sentence $\phi$ bounded by $\epsilon$ is an ordered pair
$(\phi, \epsilon)$, written $(\phi)_\epsilon$ for short, where $\phi$ is a sentence in the propositional
language $\mathcal{L}$ and $\epsilon \in [0, 1]$. $(\phi)_\epsilon = \phi$ if $\epsilon = 0$.

Whereas a classical default theory $\Delta = (F, D)$ consists of a set F of
first-order formulae and a countable set D of defaults, a statistical default theory
$\Delta_s = (W, S)$ is defined as a pair consisting of a set W of bounded sentences and
a set S of statistical defaults.

Note that a Reiter default is a special case of an s-default, namely when
$\epsilon = 0$, and classical default logic is a special case of statistical default logic,
namely when the $\epsilon$-bound of every bounded sentence and every s-default is zero.
We refer readers to [16] for the main results of statistical default logic.

Following [7], we demonstrate how to use an s-default to represent the key
structural features of an inference of the mean age of a population, X. This
problem is an instance of an inference of the mean of a normal distribution
when the standard deviation is known. Suppose we draw a sample s on X and
calculate the mean age of s, $\bar{s} = 24$ years. It is reasonable for us to infer that
the mean age of X is in the interval 288 months (24 years) $\pm 1.96\sigma$, where $\sigma$ is
the standard deviation of age in months derived from the cardinality of s. Given
the s-default rule schema $(\alpha : \beta_1, \ldots, \beta_n / [\epsilon]\, \gamma)$, we may suppose that

$\alpha$: The calculated mean age of s is 288 months $\land$ Measurement errors
are distributed normally with mean zero and variance $\sigma^2$.

$\gamma$: The age of X is within two standard deviations of 288 months.

$\beta_1$: This is the only statistic we have for X.

$\beta_2$: There is no prior statistical knowledge of the distribution of age
in the class that s belongs to that would lead to a conflicting inference.

$\beta_3$: There is no information concerning the condition of the sample
that preempts the information provided by the calculation of s.

$\epsilon = 0.05$.
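
The interval in the consequent is plain arithmetic once a value for the standard deviation is fixed. The sketch below assumes a purely hypothetical standard deviation of 6 months (the text does not give one) and uses the 1.96 multiplier that corresponds to an error bound of 0.05 under the normality assumption; the function name `mean_interval` is introduced only for this example.

```python
def mean_interval(sample_mean, sigma, z=1.96):
    """Return the interval sample_mean +/- z * sigma as a (low, high) pair."""
    half_width = z * sigma
    return sample_mean - half_width, sample_mean + half_width


# Sample mean of 288 months; sigma = 6.0 months is an assumed value.
low, high = mean_interval(288.0, 6.0)
```

With these assumed numbers the interval is roughly 276 to 300 months. The s-default only licenses this conclusion as long as none of the justifications is undermined.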

Notice that we could collect additional statistics of the age of X and undermine 
the conclusion drawn from this rule. Surely if we have two statistics, we should 
use a distribution for the average of the two values (in most cases) and that uses 
a smaller variance. 

Whether this, or one of the other justifications $\beta_1, \ldots, \beta_3$, is triggered does
not undermine the prerequisite. It remains the case that the calculated mean age
of s is 288 months and that the distribution of errors is normal, with a mean
of zero and its characteristic variance. It is the consequent, the conclusion that
claims that the mean age of the population X is 288 months $\pm 2\sigma$ months, that
is blocked. Notice that it is blocked when we have additional, not necessarily
contrary, information.

Justification $\beta_2$ says that if there is prior statistical information regarding the
mean age of X, then that information should take precedence over any conclusion 
drawn from the measurement report. For instance, if we are dealing with a popu- 
lation with known descriptive statistics (e.g., given by a census), this knowledge 







should be taken into account: we typically would not infer that the estimate based
upon s supersedes the census description of X, for suitably small populations
not affected by data recording errors. If we already have knowledge of the age
of X, this knowledge should block the application of this particular default rule.

The last justification, $\beta_3$, concerns general conditions that should be in place to get
a good estimation of the population’s mean age. For instance, if the sampling 
procedure is carried out from a direct-mail advertiser’s database, we should 
ensure that the database is not biased with respect to age. We don’t accept this 
as an explicit assumption, since s belongs to infinite reference classes. Rather, if 
we know that s is a member of a biased class with respect to age — such as readers 
of Rolling Stone — we have grounds to block the application of the default. The 
point isn’t that knowing all members of s are Rolling Stone readers entails that 
s fails to be representative, but that knowing that s is drawn exclusively from 
the class of Rolling Stone readers is sufficient to doubt that the statistical model 
fits — that is, there is reason to doubt that s is an estimate of X within two 
standard deviations of the true mean age of the population. 

3 Statistical Default Extensions 

Extensions for statistical default logic are constructed in the usual way, except 
that the operator ‘terminates’ when inference reaches a specified threshold and 
a function Crop() is called on the resulting set of bounded sentences, returning 
the set of wffs without their corresponding e-bound. For details the reader is 
referred to [16]. 

Consider the following two examples. 

Example 1. Let $\Delta_s^1 = (W, S_1)$ be a statistical default theory, where $W = \emptyset$ and
$S_1$ contains four s-defaults:

$$S_1 = \left\{ \frac{: A}{A}\,0.01,\ \frac{: B}{B}\,0.01,\ \frac{A : B, C}{C}\,0.01,\ \frac{A \land B : \lnot C}{\lnot C}\,0.01 \right\}$$

For an error-bound parameter $\epsilon_1 = 0.02$, there is one statistical default
extension $\Pi^1$ where $Crop(\Pi^1)$ contains

$A, B, A \land B, C$.

The bounded sentence A at $\epsilon_A$ is included in extension $\Pi^1$ by applying the
default $\frac{: A}{A}$ and bounded sentence B at $\epsilon_B$ is included by applying the default
$\frac{: B}{B}$, where each inference has an error bound of 0.01, so $(A)_{0.01}$ and $(B)_{0.01}$.
$(A \land B)_{\epsilon_{A \land B}}$ is included in the extension, since the sum of the error bounds of
conjoining A and B is 0.02, that is $(A \land B)_{0.02}$. The bounded sentence C at $\epsilon_C$
is included by using A, whose error bound is 0.01, to apply the default
$\frac{A : B, C}{C}$, whose error bound is also 0.01. Hence $(C)_{0.02}$. The default
$\frac{A \land B : \lnot C}{\lnot C}$ cannot be applied because the resulting conclusion $\lnot C$ would
have an error bound of 0.03, $(\lnot C)_{0.03}$, which is above the designated threshold
$\epsilon_1 = 0.02$.

For a threshold parameter $\epsilon_2 = 0.03$, there are two statistical default
extensions: $\Pi^1$, which is the same as described above, and $\Pi^2$, where $Crop(\Pi^2)$
contains







$A, B, A \land B, \lnot C$.

The default rule that could not be applied before is now applicable with respect
to $\epsilon_2$, giving rise to the second extension $\Pi^2$. 3
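
The bookkeeping of error bounds in Example 1 can be replayed with a few lines of Python. This is only a sketch of the arithmetic described in the text (each of the four defaults carries a bound of 0.01, and bounds add up along an inference); it does not check justifications or construct extensions. Bounds are stored in hundredths to avoid floating-point rounding, and the names `derive_bounds` and `within_threshold` are assumptions made for this illustration.

```python
DEFAULT_BOUND = 1  # 0.01 expressed in hundredths


def derive_bounds():
    """Accumulate the error bound of each sentence derived in Example 1."""
    bounds = {}
    bounds["A"] = DEFAULT_BOUND                       # default concluding A
    bounds["B"] = DEFAULT_BOUND                       # default concluding B
    bounds["A & B"] = bounds["A"] + bounds["B"]       # conjoining sums the bounds
    bounds["C"] = bounds["A"] + DEFAULT_BOUND         # default with prerequisite A
    bounds["-C"] = bounds["A & B"] + DEFAULT_BOUND    # default with prerequisite A & B
    return bounds


def within_threshold(bounds, threshold):
    """Sentences whose accumulated bound does not exceed the threshold."""
    return {sentence for sentence, bound in bounds.items() if bound <= threshold}


bounds = derive_bounds()
at_002 = within_threshold(bounds, 2)   # threshold 0.02
at_003 = within_threshold(bounds, 3)   # threshold 0.03
```

At threshold 0.02 the sentence `-C` is excluded because its bound is 0.03; at threshold 0.03 it becomes derivable, which is what gives rise to the second extension (where it competes with `C`).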

Example 2. Now let $\Delta_s^2 = (W, S_2)$ be a statistical default theory, where $W = \emptyset$
and $S_2$ contains six s-defaults:

5 2 = {i§§T.OO, §0.02, §§0.01, ^§0.03, ^f^O.Ol, 0.0l| 

For an error-bound parameter e\ = 0.02, there is no statistical default extension, 
since while both ^’‘" O.OO, §0.02 yield C only the bounded sentence (C, 0.00) 
from —§—0.00 may be substituted for the antecedent of §§0.01 which in turn 
is applicable in extensions consistent with B. But ' c : 0.00 is applicable only 
in extensions consistent with -<B. 

For an error-bound parameter $\epsilon_2 = 0.03$, there are three extensions. We will
continue the convention of example 1 of distinguishing them by focusing on the
literals of each extension; this will also serve our purposes in the remainder of
the paper. However, because this example highlights the role that error bounds
play in constructing extensions, we will display the extensions first in uncropped
form, then in cropped form.

$\Pi_1 \supseteq \{(C, 0.00), (C, 0.02), (\lnot B, 0.03), (A, 0.01)\}$
$\Pi_2 \supseteq \{(C, 0.00), (C, 0.02), (\lnot B, 0.03), (\lnot A, 0.01)\}$
$\Pi_3 \supseteq \{(C, 0.02), (B, 0.01), (\lnot A, 0.01)\}$

And the three corresponding cropped extensions are:

$Crop(\Pi_1) \supseteq \{C, \lnot B, A\}$
$Crop(\Pi_2) \supseteq \{C, \lnot B, \lnot A\}$
$Crop(\Pi_3) \supseteq \{C, B, \lnot A\}$

We may think of each of these sets of literals as signatures of their corresponding 
statistical default extensions. In what remains we propose an implementation 
of statistical default logic that computes the signatures of each extension of a 
statistical default theory. 

4 Computing Statistical Default Extensions 

In this section we describe an embedding of an important subset of statistical 
default theories into stable model semantics [6]. This embedding is designed to 
compute the signatures of each statistical default extension. Resorting to the
available engines for computing stable models and answer sets [14], [4],

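3 The complete cropped extensions $\Pi^1$, when $\epsilon = 0.02$, and $\Pi^1$ and $\Pi^2$, when
$\epsilon = 0.03$, are as follows: $\Pi^1_{\epsilon=0.02} = \{A, B, A \land B, C\}$;
$\Pi^1_{\epsilon=0.03} = \{A, B, A \land B, C, A \land C, B \land C\}$;
$\Pi^2_{\epsilon=0.03} = \{A, B, A \land B, \lnot C\}$.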







we indirectly provide an efficient implementation of statistical default logic. We 
start by recalling the Stable Model semantics of Gelfond and Lifschitz [5]. 

A (normal) logic program is a set of rules 4 of the form:

h :- $a_1, \ldots, a_m$, not $a_{m+1}, \ldots$, not $a_n$

where h and $a_i$ ($0 < i \le n$) are atoms of a given first-order language. Atom h
is the head of the rule, whilst $a_1, \ldots, a_m$, not $a_{m+1}, \ldots$, not $a_n$ is the body. We
say that not $a_j$ is a default negated atom. A fact is a rule with an empty body
and is succinctly represented by h. A rule with free variables stands for all its
ground instances.

Definition 3. Let P be a (ground) normal logic program and M a set of ground
atoms in the language of P (i.e. a subset of the Herbrand base of P). The reduct
$P^M$ is the default-negation-free program obtained from P by:

1. Removing all rules of P having a default negated atom not a in the body such
that $a \in M$.

2. Removing all occurrences of default negated atoms in the bodies of the
remaining rules.

The set M is a stable model of P iff M is the least Herbrand model of $P^M$.
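
Definition 3 translates almost directly into code. The following brute-force Python sketch is meant only to make the reduct and the stability check concrete for small ground programs; it enumerates every candidate set of atoms and is therefore exponential, unlike the engines cited in the text. The rule representation `(head, positive_body, negative_body)` and all function names are assumptions of this example.

```python
from itertools import chain, combinations


def reduct(program, model):
    """Gelfond-Lifschitz reduct: drop rules blocked by the model, then drop
    the remaining default negated atoms."""
    return [(head, pos) for head, pos, neg in program if not (set(neg) & model)]


def least_model(positive_program):
    """Least Herbrand model of a negation-free ground program (fixpoint)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_program:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def stable_models(program):
    """Enumerate all stable models by testing every subset of the atoms."""
    atoms = sorted({head for head, _, _ in program}
                   | {a for _, pos, neg in program for a in chain(pos, neg)})
    candidates = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    return [set(c) for c in candidates
            if least_model(reduct(program, set(c))) == set(c)]


# p :- not q.   q :- not p.
program = [("p", [], ["q"]), ("q", [], ["p"])]
models = stable_models(program)
```

For the two-rule program above, the candidates `{p}` and `{q}` each reproduce themselves as the least model of their reduct, while the empty set and `{p, q}` do not.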

The Answer Set Semantics [6] generalizes the Stable Model Semantics for the
so-called extended logic programs. Extended logic programs consist of rules:

l :- $l_1, \ldots, l_m$, not $l_{m+1}, \ldots$, not $l_n$

where l and the $l_i$ are literals, i.e. atoms (say, a) or the explicit negation of atoms
(say, $\lnot a$). The semantics is given now by special sets of ground literals, the
answer sets, extending Definition 3. The reduct operation for extended logic
programs is defined similarly, but the fixpoint equation must be changed to take
into account that the reduct program is no longer a Horn program. Essentially,
it interprets an explicitly negated literal $\lnot a$ as a new atom, unrelated to a, and the
least model is computed as before. A special condition is then added to treat
the case of the set of all literals. The reader is referred to [6], [11] for details.

The relationships of stable model and answer set semantics with default logic 
are very well understood. See for instance [11] for a full account. In the rest of 
this section we extend the existing results to statistical default logic in order to
compute statistical default extensions via stable model logic programming
engines. A first difficulty lies in the impossibility of representing real numbers.
Furthermore, the existing implementations have support only for arithmetic over
the natural/integer numbers. The following condition allows the translation of
the arithmetic operations over real numbers into corresponding operations over
natural numbers:

4 We use :- instead of ← in order to respect the syntax used in the existing
implementations.







Definition 4. Let p be a non-zero natural number. A statistical default theory
$\Delta_s = (W, S)$ is precision limited by p if every error bound $\epsilon$ in W and S is a
rational number $\epsilon = e/p$ for some natural number e such that $0 \le e \le p$.
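
Checking that a theory is precision limited, and obtaining the integer that stands in for each error bound, can be done with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the function name `to_integer_bound` is hypothetical, and error bounds are passed as decimal strings so that no floating-point rounding is introduced.

```python
from fractions import Fraction


def to_integer_bound(epsilon, p):
    """Return the natural number e with epsilon = e / p and 0 <= e <= p.

    epsilon is a decimal string or a Fraction; a ValueError is raised when
    the bound is not expressible at precision p.
    """
    scaled = Fraction(epsilon) * p
    if scaled.denominator != 1 or not (0 <= scaled <= p):
        raise ValueError(f"{epsilon} is not of the form e/{p} with 0 <= e <= {p}")
    return int(scaled)


# With precision 100, the bounds 0.01 and 0.03 become the integers 1 and 3.
one = to_integer_bound("0.01", 100)
three = to_integer_bound("0.03", 100)
```

A bound such as 0.005 is rejected at precision 100, since it would require a larger p (for instance 1000).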

We cannot translate arbitrary statistical default theories, due to the difficul- 
ties of handling statistical inferences with disjunctive formulae with the proposed 
embedding. Thus, we restrict ourselves to the following types of theories: 

Definition 5. A literal statistical default theory is a statistical default theory
$\Delta_s = (W, S)$ such that:

1. Every bounded sentence in W is of the form $(l, \epsilon)$, where l is a literal.

2. Every statistical default in S is of the form

$$\frac{l_1 \land \ldots \land l_m : j_1, \ldots, j_n}{c}\ \epsilon$$

where $l_1, \ldots, l_m$, $j_1, \ldots, j_n$ and c are all literals.

Before we proceed, we require the following auxiliary notation. Given a
literal $l = a(t_1, \ldots, t_m)$ or $l = \lnot a(t_1, \ldots, t_m)$, by $l[e]$ it is meant, respectively, the
new atom $a(t_1, \ldots, t_m, e)$ or $neg\_a(t_1, \ldots, t_m, e)$. This function adds a new
argument for propagation of error bounds, and introduces a new predicate name for
negated atoms. Similarly, by $crop(l)$ we mean the new atom $crop\_a(t_1, \ldots, t_m)$
or $crop\_neg\_a(t_1, \ldots, t_m)$.

Definition 6. Consider the literal statistical default theory $\Delta_s = (W, S)$
precision limited by p. Construct the logic program $P_{\Delta_s}(error, p)$ as follows, where
$error \le p$ is a natural number:

1. A bounded sentence $(l, \epsilon)$ in W is translated into the fact:

    l[0].

2. For every literal l in the language add the rule

    crop(l) :- l[E].

3. Every statistical default in S of the form

$$\frac{: j_1, \ldots, j_n}{c}\ \epsilon$$

is translated into the rule, where $eps = \epsilon \times p$:

    c[eps] :- eps <= error, not crop(¬j_1), ..., not crop(¬j_n).

4. Every statistical default in S of the form

$$\frac{l_1 \land \ldots \land l_m : j_1, \ldots, j_n}{c}\ \epsilon$$

is translated into the rule:

    c[A_m] :- l_1[E_1], ..., l_m[E_m],
              A_1 = eps + E_1, ..., A_m = A_{m-1} + E_m, A_m <= error,
              not crop(¬j_1), ..., not crop(¬j_n).

where $eps = \epsilon \times p$, and $E_1, \ldots, E_m$ and $A_1, \ldots, A_m$ are new free variables.

Complete the program $P_{\Delta_s}(error, p)$ with the following closure rules, for every
combination of atoms a and b in the language:

    a[E] :- b[E1], ¬b[E2], E = E1 + E2, E <= error.

    ¬a[E] :- b[E1], ¬b[E2], E = E1 + E2, E <= error.

For simplicity, we assume that the sum operation, as well as the equality and 
arithmetic comparison predicates are built-in. Theoretically, this can be captured 
by an infinite set of ground facts of the form X = Y + Z , such that variables are 
substituted by natural numbers x, y, z obeying the equation; the same applies 
to facts of the form X <= Y, where X and Y are instantiated with two natural
numbers $x \le y$.
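
To make the translation of statistical defaults in Definition 6 concrete, the following Python sketch generates the rule text for propositional literals. It is an illustration under stated assumptions, not the authors' implementation: literals are strings, with a leading `-` marking explicit negation, error bounds are already scaled to integers, and the helper names (`atom_name`, `complement`, `crop`, `translate_default`) are hypothetical.

```python
def atom_name(literal):
    """Predicate name for a literal: 'a' stays 'a', '-a' becomes 'neg_a'."""
    return "neg_" + literal[1:] if literal.startswith("-") else literal


def complement(literal):
    """Complement of a literal: 'a' <-> '-a'."""
    return literal[1:] if literal.startswith("-") else "-" + literal


def crop(literal):
    """Name of the cropped atom for a literal."""
    return "crop_" + atom_name(literal)


def translate_default(prerequisites, justifications, conclusion, eps, error):
    """Rule text for one s-default with integer bound eps and threshold error."""
    blockers = [f"not {crop(complement(j))}" for j in justifications]
    head = atom_name(conclusion)
    m = len(prerequisites)
    if m == 0:
        # Defaults without prerequisites only compare their own bound.
        return f"{head}({eps}) :- {', '.join([f'{eps} <= {error}'] + blockers)}."
    body = [f"{atom_name(l)}(E{i})" for i, l in enumerate(prerequisites, start=1)]
    previous = str(eps)
    for i in range(1, m + 1):
        # Accumulate the default's bound plus the bounds of the prerequisites.
        body.append(f"A{i} = {previous} + E{i}")
        previous = f"A{i}"
    body.append(f"A{m} <= {error}")
    return f"{head}(A{m}) :- {', '.join(body + blockers)}."


# A default with prerequisite a, justifications b and c, conclusion c,
# bound 1 (i.e. 0.01 at precision 100) and threshold 3.
rule = translate_default(["a"], ["b", "c"], "c", 1, 3)
```

Each justification j contributes the default negation of the cropped complement of j, so the rule is blocked as soon as the complement has been derived at any error bound.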

The translation is self-explanatory. The first case takes care of the theory
W; by design of statistical default logic, the knowledge in W is assumed to be
error free. The rules introduced in the second step implement
the crop operation. The translation of statistical defaults is then immediate:
error-bounds are propagated from the bodies to the heads of rules, taking
into account the global threshold error and the error-bound of the default. The
justifications are translated into default negations of the complements, as usual in
the relationship between default logic and answer set semantics. The last set of rules
encodes the explosive behavior of statistical default logic in the face of contradiction,
which differs from that of answer set semantics. The major result is the
following:

Theorem 1. Consider a literal statistical default theory $\Delta_s = (W, S)$ with error-bound
parameter $\epsilon$ and precision limited by $p$, and let $\mathit{error} = \epsilon \times p$ be a natural
number. Then a set of ground literals $\{l_1, \ldots, l_i, \ldots\}$ is contained in $\mathit{Crop}(\Pi)$,
where $\Pi$ is a statistical default extension of $\Delta_s$, iff there is a stable model of
program $P_{\Delta_s}(\mathit{error}, p)$ containing $\{\mathit{crop}(l_1), \ldots, \mathit{crop}(l_i), \ldots\}$.

By resorting to the known translation of extended logic programming under
the answer set semantics into default logic [11] and the relationship of statistical
default logic with Reiter's default logic we obtain the following corollary:

Corollary 1. Let $P$ be an extended logic program and construct the statistical
default theory $\Delta_P = (\emptyset, S)$ by including in $S$ a default

$$\frac{l_1 \wedge \cdots \wedge l_m \;:\; \neg l_{m+1}, \ldots, \neg l_n}{l}\ 0.0$$

for each rule

$$l \leftarrow l_1, \ldots, l_m,\ \mathit{not}\ l_{m+1}, \ldots, \mathit{not}\ l_n$$







in the extended logic program. Then $M$ is an answer set of $P$ iff $\Pi$ is a statistical
default extension of $\Delta_P$ such that $\mathit{Cn}(M) = \mathit{Crop}(\Pi)$, where $\mathit{Cn}$ is the first-order
consequence operator.

We conclude by illustrating the embedding: 

Example 3. Consider the theory of Example 1 with error-bound threshold of 
0.03, and precision limited by 100. The translated normal logic program is: 

crop_a :- a(_).
crop_b :- b(_).
crop_c :- c(_).
crop_neg_a :- neg_a(_).
crop_neg_b :- neg_b(_).
crop_neg_c :- neg_c(_).

a(1) :- 1 <= 3, not crop_neg_a.
b(1) :- 1 <= 3, not crop_neg_b.

c(A1) :- a(E1), A1 = 1 + E1, A1 <= 3,
         not crop_neg_b, not crop_neg_c.

neg_c(A2) :- a(E1), b(E2),
             A1 = 1 + E1, A2 = A1 + E2, A2 <= 3, not crop_c.

a(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
neg_a(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
a(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
neg_a(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
a(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
neg_a(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
b(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
neg_b(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
b(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
neg_b(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
b(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
neg_b(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
c(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
neg_c(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
c(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
neg_c(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
c(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
neg_c(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.







The stable models of the above program are: 

{a(1), b(1), neg_c(3), crop_a, crop_b, crop_neg_c}

{a(1), b(1), c(2), crop_a, crop_b, crop_c}

which correspond exactly to the signature statistical default extensions of
Example 1.
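The two stable models of Example 3 can be checked by brute force. The sketch below is not the authors' tool chain: it grounds the program by hand over the error values 0..3, guesses the truth values of the six crop atoms (the only atoms occurring under default negation), builds the Gelfond-Lifschitz reduct for each guess, and keeps the guesses that the least model of the reduct reproduces. All function names are illustrative assumptions.

```python
from itertools import combinations

ERROR = 3                                   # threshold 0.03 at precision 100
LITS = ["a", "b", "c", "neg_a", "neg_b", "neg_c"]
CROPS = ["crop_" + lit for lit in LITS]
VALUES = range(ERROR + 1)


def ground_program():
    """Ground rules of Example 3 as (head, positive_body, negative_body)."""
    rules = []
    for lit in LITS:                        # crop_l :- l(_).
        for e in VALUES:
            rules.append((f"crop_{lit}", [(lit, e)], []))
    rules.append((("a", 1), [], ["crop_neg_a"]))
    rules.append((("b", 1), [], ["crop_neg_b"]))
    for e1 in VALUES:                       # c(A1) :- a(E1), A1 = 1 + E1, ...
        if 1 + e1 <= ERROR:
            rules.append((("c", 1 + e1), [("a", e1)],
                          ["crop_neg_b", "crop_neg_c"]))
        for e2 in VALUES:                   # neg_c(A2) :- a(E1), b(E2), ...
            if 1 + e1 + e2 <= ERROR:
                rules.append((("neg_c", 1 + e1 + e2),
                              [("a", e1), ("b", e2)], ["crop_c"]))
    for x in "abc":                         # closure rules
        for y in "abc":
            for e1 in VALUES:
                for e2 in VALUES:
                    if e1 + e2 <= ERROR:
                        body = [(y, e1), ("neg_" + y, e2)]
                        rules.append(((x, e1 + e2), body, []))
                        rules.append((("neg_" + x, e1 + e2), body, []))
    return rules


def least_model(positive_rules):
    """Naive fixpoint iteration for a negation-free ground program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in positive_rules:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model


def stable_models(rules):
    found = []
    for k in range(len(CROPS) + 1):
        for guess in map(set, combinations(CROPS, k)):
            reduct = [(h, pos) for h, pos, neg in rules if not guess & set(neg)]
            model = least_model(reduct)
            if model & set(CROPS) == guess:  # the guess is reproduced
                found.append(model)
    return found


def show(model):
    return {x if isinstance(x, str) else f"{x[0]}({x[1]})" for x in model}


models = [show(m) for m in stable_models(ground_program())]
```

Guessing only the negated atoms is sound here because the reduct depends on a candidate model solely through those atoms; it keeps the search at 64 candidates instead of all subsets of the ground atoms.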

Example 4. Consider the theory of Example 2 with error-bound threshold of
0.03, and precision limited by 100. The translated logic program is:

crop_a :- a(_).
crop_b :- b(_).
crop_c :- c(_).
crop_neg_a :- neg_a(_).
crop_neg_b :- neg_b(_).
crop_neg_c :- neg_c(_).

a(1) :- 1 <= 3, not crop_b, not crop_neg_a.
neg_a(1) :- 1 <= 3, not crop_a.

b(A1) :- c(E1), A1 = 1 + E1, A1 <= 3, not crop_neg_b.
neg_b(3) :- 3 <= 3, not crop_b.

c(0) :- 0 <= 3, not crop_b, not crop_neg_c.
c(2) :- 2 <= 3, not crop_neg_c.

a(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
neg_a(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
a(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
neg_a(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
a(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
neg_a(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
b(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
neg_b(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
b(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
neg_b(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
b(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
neg_b(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
c(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
neg_c(E) :- a(E1), neg_a(E2), E = E1 + E2, E <= 3.
c(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
neg_c(E) :- b(E1), neg_b(E2), E = E1 + E2, E <= 3.
c(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.
neg_c(E) :- c(E1), neg_c(E2), E = E1 + E2, E <= 3.







The stable models of the above program are: 

{neg_a(1), neg_b(3), c(0), c(2), crop_neg_a, crop_neg_b, crop_c}

{neg_a(1), b(3), c(2), crop_neg_a, crop_b, crop_c}

{a(1), neg_b(3), c(0), c(2), crop_a, crop_neg_b, crop_c}

which correspond exactly to the signature statistical default extensions of
Example 2.

5 Comparisons 

Literal statistical default theories have interesting connections to existing
probabilistic logic programming frameworks, namely the Stable Semantics for
Probabilistic Deductive Databases [13]. A default

$$\frac{l_1 \wedge \cdots \wedge l_m \;:\; j_1, \ldots, j_n}{c}\ \epsilon, \qquad \epsilon < 1,$$

in a literal statistical default theory can be translated into a general probabilistic
logic program of Ng and Subrahmanian [13] of the form$^{5}$:

$$\begin{aligned}
\mathit{eps} : [1 - \epsilon, 1] &\leftarrow \\
\mathit{prereq} : [V, 1] &\leftarrow (\mathit{eps} \wedge l_1 \wedge \cdots \wedge l_m) : [V, 1] \\
c : [V, 1] &\leftarrow \mathit{prereq} : [1 - \mathit{error}, 1] \wedge \mathit{prereq} : [V, 1] \wedge {} \\
&\qquad \mathit{not}\ \neg j_1 : [1 - \mathit{error}, 1] \wedge \cdots \wedge \mathit{not}\ \neg j_n : [1 - \mathit{error}, 1]
\end{aligned}$$

Note that $V$ is an annotation variable, and $\mathit{error}$ is the fixed error-bound threshold
parameter. The translation of the closure rules is immediate and there is no
need to introduce crop sentences, since this is already accommodated in the tests
$\mathit{not}\ \neg j_i : [1 - \mathit{error}, 1]$ and $\mathit{prereq} : [1 - \mathit{error}, 1]$.

The translation is justified by the observation that a literal $l$ with error-bound
$\epsilon$ is equivalent to saying that the probability of $l$ is in the interval $[1 - \epsilon, 1]$. Now,
if the error-bound of a literal $l_1$ (resp. $l_2$) is $\epsilon_1$ (resp. $\epsilon_2$), this means that the
probability of $l_1$ lies in $[1 - \epsilon_1, 1]$ (resp. that of $l_2$ in $[1 - \epsilon_2, 1]$). Thus the
probability of $l_1 \wedge l_2$ lies in $[1 - (\epsilon_1 + \epsilon_2), 1]$, if $\epsilon_1 + \epsilon_2 < 1$. Now, the conjunction
symbol in $(\mathit{eps} \wedge l_1 \wedge \cdots \wedge l_m) : [V, 1]$ corresponds to the conjunctive ignorance
probabilistic strategy of Hybrid Probabilistic Logic Programs [3], which combines
the probability intervals $[a_1, b_1]$ and $[a_2, b_2]$ according to:

$$[a_1, b_1] \wedge [a_2, b_2] = [\max(0, a_1 + a_2 - 1), \min(b_1, b_2)]$$

By applying the ignorance strategy to the previous intervals for $l_1$ and $l_2$ we
obtain the expected result:

$$\begin{aligned}
[1 - \epsilon_1, 1] \wedge [1 - \epsilon_2, 1] &= [\max(0, (1 - \epsilon_1) + (1 - \epsilon_2) - 1), \min(1, 1)] \\
&= [\max(0, 1 - \epsilon_1 - \epsilon_2), 1] = [\max(0, 1 - (\epsilon_1 + \epsilon_2)), 1]
\end{aligned}$$
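The interval arithmetic above can be replayed numerically. The following is a minimal sketch under the stated combination rule only; the function names `from_error_bound` and `conj_ignorance` are illustrative and not part of any of the cited frameworks. Exact fractions are used so that the comparison with 1 − (ε1 + ε2) is not disturbed by floating-point rounding.

```python
from fractions import Fraction


def from_error_bound(eps):
    """Interval [1 - eps, 1] for a literal with error bound eps."""
    return (1 - Fraction(eps), Fraction(1))


def conj_ignorance(i1, i2):
    """Conjunctive ignorance: [max(0, a1 + a2 - 1), min(b1, b2)]."""
    (a1, b1), (a2, b2) = i1, i2
    return (max(Fraction(0), a1 + a2 - 1), min(b1, b2))


# Error bounds 0.01 and 0.02 add up to 0.03 under conjunction.
combined = conj_ignorance(from_error_bound("0.01"), from_error_bound("0.02"))

# When the error bounds sum to more than 1, the lower bound is clamped at 0.
clamped = conj_ignorance(from_error_bound("0.6"), from_error_bound("0.7"))
```

Here `combined` coincides with `from_error_bound("0.03")`, which is the additive propagation of error bounds used by the translated rules.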

5 The authors use ¬ instead of not to represent default negation. We use not here in
order to avoid confusion with the previous translations.







It follows that the framework of [13] is expressive enough to capture literal
statistical default theories. However, the authors do not present in [13] any
translation into stable model semantics, which we have provided here. Furthermore,
the more recent Hybrid Probabilistic Logic Programming framework [3]
does not provide a default negation construct and thus cannot embed literal
statistical default theories.

A translation of disjunctive logic programs with probabilistic semantics into
stable models is presented in [9], but it assumes positively correlated interpretations,
i.e. the probability of $A \wedge B$ is given by the minimum of the probability of $A$
and the probability of $B$. Since SDL is intended to be quite general and therefore
adopts an ignorance strategy for combination, this framework does not appear
to be able to capture statistical default theories. Lukasiewicz also proposed an
approach for reasoning from statistical and subjective knowledge, based on the
combination of probabilistic conditional constraints with default reasoning [10],
but the relationship to our work remains to be studied.



6 Conclusions 

In this paper we have presented an embedding of literal statistical default
theories into stable model semantics. The embedding is designed to compute the
signature set of literals that uniquely distinguishes each extension of a statistical
default theory. We also offered a comparison of this work to existing probabilistic
logic programming frameworks, highlighting the new contribution of our results.$^{6}$



References 

[1] Bacchus, F., A. Grove, J. Halpern and D. Koller. 1993. “Statistical Foundations
for Default Reasoning,” Proceedings of the International Joint Conference on
Artificial Intelligence 1993 (IJCAI-93), 563-569.

[2] Cramér, H. 1946. Mathematical Methods of Statistics, Princeton: Princeton
University Press.

[3] Dekhtyar, A. and V.S. Subrahmanian. 2000. “Hybrid Probabilistic Programs,”
Journal of Logic Programming, 43(3): 187-250.

[4] Eiter, T., N. Leone, C. Mateis, G. Pfeifer and F. Scarcello. 1998. “The KR System dlv:
Progress Report, Comparisons and Benchmarks,” KR '98: Principles of Knowledge
Representation and Reasoning, Cohn, A., L. Schubert and S. Shapiro [eds.].
San Francisco: Morgan Kaufmann.

[5] Gelfond, M. and V. Lifschitz. 1988. “The Stable Model Semantics for Logic
Programming,” Proceedings of the 5th International Conference on Logic
Programming, Kowalski, R. and K. Bowen [eds.]. Cambridge: MIT Press, 1070-1080.

[6] Gelfond, M. and V. Lifschitz. 1990. “Logic Programs with Classical Negation,”
Proceedings of the 7th International Conference on Logic Programming, Warren,
D. and P. Szeredi [eds.]. Cambridge: MIT Press, 579-597.



6 This research was supported by grant SES 990-6128 from the National Science
Foundation and SFRH/BPD-13699-2003 from Fundação para a Ciência e a Tecnologia.







[7] Kyburg, H. E., Jr. and C. M. Teng. 1999. “Statistical Inference as Default Logic,”
International Journal of Pattern Recognition and Artificial Intelligence, 13(2):
267-283.

[8] Larsen, R. J. and M. L. Marx. 2001. An Introduction to Mathematical Statistics,
Upper Saddle River, NJ: Prentice Hall.

[9] Lukasiewicz, T. 2001. “Fixpoint Characterizations for Many-Valued Disjunctive
Logic Programs with Probabilistic Semantics,” Proceedings of the 6th
International Conference on Logic Programming and Nonmonotonic Reasoning
(LPNMR-01), Vienna, Austria, September 2001. Volume 2173 of Lecture Notes
in Artificial Intelligence, Springer, 336-350.

[10] Lukasiewicz, T. 2002. “Probabilistic Default Reasoning with Conditional
Constraints,” Annals of Mathematics and Artificial Intelligence, 34(1-3): 35-88.

[11] Marek and Truszczynski. 1993. Nonmonotonic Logic, Berlin: Springer-Verlag.

[12] Moore. 1979. Statistics, San Francisco: W. H. Freeman Press.

[13] Ng, Raymond and V. S. Subrahmanian. 1994. “Stable Semantics for Probabilistic
Deductive Databases,” Information and Computation, 110(1): 42-83.

[14] Niemelä, I. and P. Simons. 1996. “Efficient Implementation of the Well-founded
and Stable Model Semantics,” Proceedings of the Joint International Conference
and Symposium on Logic Programming, Maher, M. [ed.]. Cambridge: MIT Press.

[15] Reiter, R. 1980. “A Logic for Default Reasoning,” Artificial Intelligence, 13:
81-132.

[16] Wheeler, G. R. 2004. “A Resource Bounded Default Logic,” in James Delgrande
and Torsten Schaub (eds.) Proceedings of the 10th International Workshop on
Non-monotonic Reasoning (NMR-2004), Whistler, British Columbia, 416-422.




Capturing Parallel Circumscription with 
Disjunctive Logic Programs* 



Tomi Janhunen and Emilia Oikarinen 

Helsinki University of Technology, Laboratory for Theoretical Computer Science
P.O. Box 5400, FIN-02015 HUT, Finland
Tomi.Janhunen@hut.fi, Emilia.Oikarinen@hut.fi



Abstract. The stable model semantics of disjunctive logic programs is based on 
classical models which are minimal with respect to subset inclusion. As a conse- 
quence, every atom appearing in a disjunctive program is false by default. This 
is sometimes undesirable from the knowledge representation point of view and 
a more refined control of minimization is called for. Such features are already 
present in Lifschitz’s parallel circumscription where certain atoms are allowed to 
vary or to have fixed values while all other atoms are minimized. In this paper, 
it is formally shown that the expressive power of minimal models is properly in- 
creased in the presence of varying atoms. In spite of this, we show how parallel 
circumscription can be embedded into disjunctive logic programming in a rela- 
tively systematic fashion using a linear and faithful, but non-modular translation. 
This enables the conscious use of varying atoms in disjunctive logic programs — 
leading to more elegant and concise problem representations in various domains. 



1 Introduction 

In disjunctive logic programming, a rule-based language which allows disjunctions in the
heads of rules is used for knowledge representation. Along with the development of efficient
implementations such as dlv [15] and GnT [13], various problems have been formalized
as disjunctive logic programs. The semantics of disjunctive logic programs is determined
by stable models [8,20] which are minimal with respect to subset inclusion. This makes
every atom appearing in a disjunctive logic program false by default. In many cases, this
is highly desirable, but certain problems become awkward to formalize if all atoms are
blindly subject to minimization. This suggests a revision of the stable model semantics
in order to incorporate atoms that are not false by default.

The need for atoms that are not subject to minimization has already been recognized
in conjunction with normal logic programs, which form a special case of disjunctive
logic programs. Simons [23] introduces choice rules which allow the definition of atoms
that are not false by default. The same effect can be obtained by allowing negation as
failure in the heads of disjunctive rules [9]: a rule of the form $a \vee {\sim}a$ represents the
fact that $a$ can be true or false. As shown by the first author [11], negation as failure
can be removed from the heads of disjunctive rules using a linear transformation. This

* The research reported in this paper has been partially funded by the Academy of Finland (project 
#53695) and the European Commission (contract IST-FET-2001-37004). 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 134-146, 2004.
© Springer-Verlag Berlin Heidelberg 2004







implies that choice rules can be effectively expressed using disjunctive rules. However, 
it is important to realize that atoms definable in this way are essentially fixed atoms in the 
sense of parallel circumscription [16] which is based on a refined notion of minimality. 

In addition to fixed atoms and those subject to minimization, parallel circumscription
incorporates yet another category of atoms, namely atoms that are allowed to vary. As
demonstrated by Lifschitz's ostrich example [16], varying atoms tend to increase the
knowledge representation capabilities of ordinary circumscription [18] where all atoms
are subject to minimization. Unfortunately, varying atoms are not yet well-supported
in disjunctive logic programming, although serious attempts to embed parallel
circumscription into disjunctive logic programming have already been made. The approach by
Gelfond and Lifschitz [7] is restricted to the stratifiable case and the one by Sakama
and Inoue [22] involves characteristic clauses which imply an exponential time/space
complexity in the worst case. Quite recently, Lee and Lin [14] characterized parallel
circumscription in terms of loop formulas and then used them to embed parallel
circumscription in disjunctive logic programming. However, the number of loops can be
exponential in the worst case. Thus it remains open whether an efficient translation from
parallel circumscription into disjunctive logic programs is feasible in the general case.

The goal of this paper is to develop such a translation, enabling the conscious use
of varying atoms in disjunctive logic programs. We proceed as follows. In Section 2, we
review the syntax and semantics of disjunctive logic programs and present the notion
of visible equivalence to enable natural comparisons of programs. Then the effects of
varying and fixed atoms on the expressiveness of positive disjunctive programs are
studied in Section 3. The key result is that varying atoms lead to a proper increase
in expressive power, which we believe explains the above-mentioned difficulties in
translating parallel circumscription. A linear but non-modular translation for removing
varying atoms is presented in Section 4. This paper is concluded by Section 5 where we
also sketch potential applications of varying atoms in disjunctive logic programming.

2 Disjunctive Logic Programs Revisited 

In this section, we review the basic concepts of disjunctive logic programming in the
propositional case. A disjunctive logic program (DLP) $\Pi$ is a set of rules of the form

$$a_1 \vee \cdots \vee a_n \leftarrow b_1, \ldots, b_m, {\sim}c_1, \ldots, {\sim}c_k, \qquad (1)$$

where $n, m, k \geq 0$ and $a_1, \ldots, a_n$, $b_1, \ldots, b_m$, and $c_1, \ldots, c_k$ are propositional atoms.
The head of the rule $a_1 \vee \cdots \vee a_n$ is interpreted disjunctively while the rest forming
the body of the rule is interpreted conjunctively. The symbol $\sim$ denotes negation as
failure to prove, or default negation for short. Intuitively, a rule of the form (1) acts as an
inference rule: any of the head atoms $a_1, \ldots, a_n$ can be inferred given that the positive
body atoms $b_1, \ldots, b_m$ can be inferred and the negative body atoms $c_1, \ldots, c_k$ cannot.

We define literals in the standard way using $\sim$ as the connective for negation. For
any set of atoms $A$, we define a set of negative literals ${\sim}A = \{{\sim}a \mid a \in A\}$. Since the
order of atoms is insignificant in a rule (1), we use a shorthand $A \leftarrow B, {\sim}C$ where $A$,
$B$ and $C$ are the sets of atoms involved in (1). If necessary, we separate rules with full
stops and we drop the symbol “$\leftarrow$” in case of an empty body. An empty head ($n = 0$) is






denoted by “$\bot$” and a rule with an empty head is called an integrity constraint. A DLP
$\Pi$ is positive if and only if $k = 0$ holds for every rule (1) of $\Pi$. We remind the reader that
positive DLPs (PDLPs) can be viewed as propositional theories in conjunctive normal
form (CNF) which can be obtained in linear time using new atoms.

2.1 Semantics: Minimal and Stable Models 

We define the Herbrand base $\mathrm{Hb}(\Pi)$ of a DLP $\Pi$ as a set of atoms which contains
all atoms appearing in $\Pi$. Due to the flexibility of this definition, we view $\mathrm{Hb}(\Pi)$ as the
symbol table of $\Pi$ so that it contributes to the length of $\Pi$ in symbols, denoted by $\|\Pi\|$.
Following the ideas from [12], we partition $\mathrm{Hb}(\Pi)$ into two parts $\mathrm{Hb}_v(\Pi)$ and $\mathrm{Hb}_h(\Pi)$
which determine the visible and the hidden parts of $\mathrm{Hb}(\Pi)$, respectively. The visibility
of atoms becomes important in Section 2.2 where the equivalence of DLPs is of interest,
but for now we concentrate on defining the semantics of propositional DLPs.

An interpretation $I \subseteq \mathrm{Hb}(\Pi)$ of $\Pi$ determines which atoms $a \in \mathrm{Hb}(\Pi)$ are true
($a \in I$) and which are false ($a \notin I$). An interpretation $I$ is a (classical) model of $\Pi$,
denoted by $I \models \Pi$, if and only if for every rule $A \leftarrow B, {\sim}C$ of $\Pi$, $B \subseteq I$ and $C \cap I = \emptyset$
imply $A \cap I \neq \emptyset$, i.e. the satisfaction of the rule body implies that one of the head atoms
must also be true. It is customary to distinguish minimal models of a DLP $\Pi$, i.e. models
$M \models \Pi$ for which there are no other models $N \models \Pi$ such that $N \subset M$. The set
of minimal models of $\Pi$ is denoted by $\mathrm{MM}(\Pi)$. If $\Pi$ is a positive DLP, then $\mathrm{MM}(\Pi)$
determines the standard minimal model semantics of $\Pi$. Unfortunately, minimal models
do not properly capture intuitions behind DLPs involving default negation, but stable
models [8,20] provide a reasonable semantics for such programs.

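Definition 1. Given a DLP $\Pi$ and an interpretation $M \subseteq \mathrm{Hb}(\Pi)$, the Gelfond-Lifschitz
reduct of $\Pi$ is a positive DLP

$$\Pi^M = \{A \leftarrow B \mid A \leftarrow B, {\sim}C \in \Pi \text{ and } M \cap C = \emptyset\}. \qquad (2)$$

An interpretation $M \subseteq \mathrm{Hb}(\Pi)$ is a stable model of $\Pi$ if and only if $M \in \mathrm{MM}(\Pi^M)$.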

Given a DLP $\Pi$, we let $\mathrm{SM}(\Pi)$ denote the set of stable models of $\Pi$. Any two DLPs
$\Pi$ and $\Pi'$ are considered to be equivalent under the stable model semantics, denoted
by $\Pi \equiv \Pi'$, if and only if $\mathrm{SM}(\Pi) = \mathrm{SM}(\Pi')$. For instance, we have $\Pi \equiv \Pi'$ for
$\Pi = \{a \vee b.\}$ and $\Pi' = \{a \leftarrow {\sim}b.\ b \leftarrow {\sim}a.\}$, as $\mathrm{SM}(\Pi) = \{\{a\}, \{b\}\} = \mathrm{SM}(\Pi')$.
The preceding definition of $\equiv$ is justifiable from the viewpoint of formalizing a problem
at hand as a DLP $\Pi$: the stable models of the program $\Pi$ are often supposed to be in a
one-to-one correspondence with the solutions of the problem. If $\Pi \equiv \Pi'$ holds for two
programs $\Pi \neq \Pi'$ formalizing the same problem, then the same solutions are obtained.
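Definition 1 and the example above can be replayed with a brute-force checker. This is a sketch for tiny propositional programs only; the rule encoding as triples of frozensets and all function names are assumptions made for illustration, not an established API.

```python
from itertools import combinations


def rule(head, pos=(), neg=()):
    """A rule  head <- pos, ~neg  as a triple of frozensets of atoms."""
    return (frozenset(head), frozenset(pos), frozenset(neg))


def subsets(atoms):
    atoms = sorted(atoms)
    return [frozenset(c) for k in range(len(atoms) + 1)
            for c in combinations(atoms, k)]


def satisfies(interp, r):
    head, pos, neg = r
    body_true = pos <= interp and not (neg & interp)
    return (not body_true) or bool(head & interp)


def minimal_models(program, atoms):
    models = [i for i in subsets(atoms)
              if all(satisfies(i, r) for r in program)]
    return [m for m in models if not any(n < m for n in models)]


def reduct(program, m):
    """Gelfond-Lifschitz reduct: drop rules blocked by m, strip negation."""
    return [(h, p, frozenset()) for h, p, n in program if not (n & m)]


def stable_models(program, atoms):
    return {m for m in subsets(atoms)
            if m in minimal_models(reduct(program, m), atoms)}


atoms = {"a", "b"}
pi = [rule(["a", "b"])]                      # a v b.
pi_prime = [rule(["a"], neg=["b"]),          # a <- ~b.
            rule(["b"], neg=["a"])]          # b <- ~a.
expected = {frozenset({"a"}), frozenset({"b"})}
```

Both `stable_models(pi, atoms)` and `stable_models(pi_prime, atoms)` evaluate to `expected`, which is the equivalence claimed in the text. The enumeration is exponential in the number of atoms and is meant only as a definitional check.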

2.2 Visible Equivalence 

A drawback of the relation $\equiv$ is that it does not take the visibility of atoms into account.
It is typical that a DLP $\Pi$ contains atoms formalizing certain auxiliary concepts local to
$\Pi$. Such atoms carry little relevance for other programs. This is why we adopt a slightly







more general notion of equivalence [12] which treats the visible part $\mathrm{Hb}_v(\Pi)$ of the
Herbrand base $\mathrm{Hb}(\Pi)$ as the program interface of $\Pi$. The key idea is that the hidden
atoms in $\mathrm{Hb}_h(\Pi) = \mathrm{Hb}(\Pi) \setminus \mathrm{Hb}_v(\Pi)$ can be viewed as local to $\Pi$ and hence negligible
as far as the equivalence of $\Pi$ with other programs is concerned. The definition below
is given relative to the sets of interpretations $\mathrm{SEM}(\Pi)$ and $\mathrm{SEM}(\Pi')$ which determine
the semantics of $\Pi$ and $\Pi'$, respectively. We need this kind of flexibility in Section 3
when we compare PDLPs which are (possibly) based on different semantics than the
stable semantics. The reader may assume $\mathrm{SEM}(\Pi) = \mathrm{SM}(\Pi)$ unless otherwise stated.

Definition 2. Two DLPs $\Pi$ and $\Pi'$ are visibly equivalent, denoted by $\Pi \equiv_v \Pi'$, if and
only if $\mathrm{Hb}_v(\Pi) = \mathrm{Hb}_v(\Pi')$ and there is a bijection $f : \mathrm{SEM}(\Pi) \to \mathrm{SEM}(\Pi')$ such
that for all interpretations $M \in \mathrm{SEM}(\Pi)$, $M \cap \mathrm{Hb}_v(\Pi) = f(M) \cap \mathrm{Hb}_v(\Pi')$.

It is easy to verify that $\equiv_v$ is an equivalence relation. To compare $\equiv_v$ with $\equiv$,
we note that these two relations coincide given that $\mathrm{Hb}_h(\Pi) = \mathrm{Hb}_h(\Pi') = \emptyset$ and
$\mathrm{Hb}(\Pi) = \mathrm{Hb}(\Pi')$. The latter condition is actually of little account, as it can be readily
satisfied e.g. by extending Herbrand bases with “useless” rules of the form $a \leftarrow a$.

Example 1. Consider logic programs $\Pi = \{a \leftarrow b.\ a \leftarrow c.\ b \leftarrow {\sim}c.\ c \leftarrow {\sim}b.\}$ and
$\Pi' = \{a \leftarrow d, {\sim}e.\ a \leftarrow e, {\sim}d.\ d \vee e.\}$ with $\mathrm{Hb}_v(\Pi) = \mathrm{Hb}_v(\Pi') = \{a\}$. The stable
models of $\Pi$ are $M_1 = \{a, b\}$ and $M_2 = \{a, c\}$ whereas for $\Pi'$ they are $N_1 = \{a, d\}$ and
$N_2 = \{a, e\}$. Thus $\Pi \not\equiv \Pi'$ is clearly the case, but we have a bijection $f : \mathrm{SM}(\Pi) \to
\mathrm{SM}(\Pi')$ which maps $M_i$ to $N_i$ for $i \in \{1, 2\}$. Hence $\Pi \equiv_v \Pi'$. □
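For finite sets of models, the bijection required by Definition 2 exists exactly when both semantics yield the same multiset of visible projections. The sketch below checks this for Example 1; the function name `visibly_equivalent` is an assumption, and the model sets are supplied by hand rather than computed.

```python
from collections import Counter


def visibly_equivalent(sem1, vis1, sem2, vis2):
    """Check Definition 2 for finite model sets sem1, sem2.

    A projection-preserving bijection exists iff every visible projection
    occurs equally often on both sides, i.e. the multisets coincide.
    """
    vis1, vis2 = frozenset(vis1), frozenset(vis2)
    if vis1 != vis2:
        return False
    proj1 = Counter(frozenset(m) & vis1 for m in sem1)
    proj2 = Counter(frozenset(m) & vis2 for m in sem2)
    return proj1 == proj2


# Stable models of Example 1, entered by hand.
sm_pi = [{"a", "b"}, {"a", "c"}]
sm_pi_prime = [{"a", "d"}, {"a", "e"}]
visible = {"a"}
```

Dropping one model from either side breaks the count for the projection {a}, so no bijection exists and the check fails, even though the set of projections is unchanged.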

3 Parallel Circumscription and Its Expressive Power 

In this section, we analyze the expressive power of Lifschitz's parallel
circumscription [16] by studying the effects of denying varying atoms and/or fixed atoms on the
expressiveness of minimal models. In analogy to Section 2, we formulate parallel
circumscription in the propositional case. Rather than using arbitrary propositional sentences
to formulate propositional theories, we assume that the syntax of PDLPs is used. As
discussed already in the introduction, parallel circumscription is based on a notion of
minimality which partitions atoms in three disjoint categories.

Definition 3. Let $\Pi$ be a PDLP and let $V \subseteq \mathrm{Hb}(\Pi)$ and $F \subseteq \mathrm{Hb}(\Pi)$ be two sets of
atoms satisfying $V \cap F = \emptyset$. A model $M \models \Pi$ is $(V, F)$-minimal if and only if there is
no model $N \models \Pi$ such that (i) $N \setminus (V \cup F) \subset M \setminus (V \cup F)$ and (ii) $N \cap F = M \cap F$.

The idea is that the atoms in $\mathrm{Hb}(\Pi) \setminus (V \cup F)$ are subject to minimization in analogy
to Section 2.1. However, while such a minimization takes place, the truth values of the
atoms in $V$ may vary freely and the truth values of the atoms in $F$ are kept fixed. The
set of all $(V, F)$-minimal models of $\Pi$ is denoted by $\mathrm{MM}_{V,F}(\Pi)$. It is customary in
disjunctive logic programming that all atoms are subject to minimization, i.e. $(\emptyset, \emptyset)$-minimal
models of a positive DLP $\Pi$ are of interest. Under this restriction, the first
condition of Definition 3 is equivalent to $N \subset M$ while the second condition becomes
void. Thus $\mathrm{MM}(\Pi) = \mathrm{MM}_{\emptyset,\emptyset}(\Pi)$. In the sequel, we are interested in the problem of
determining $(V, F)$-minimal models for a given positive DLP $\Pi$. Note that $V \subseteq \mathrm{Hb}(\Pi)$







and $F \subseteq \mathrm{Hb}(\Pi)$ are separately specified for each program $\Pi$ and are thus viewed as
parts of the respective programs. For now, we concentrate on answering the following
question: is it possible to remove fixed and varying atoms by translating a PDLP involving
such atoms into another PDLP not containing such atoms?
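The notion of (V, F)-minimality can be made concrete with a brute-force enumerator for small positive programs. This is a sketch only: rules are encoded as (head, body) pairs of frozensets and the names `all_models` and `vf_minimal` are illustrative assumptions. The sample programs are the two-atom programs {a ∨ b.} and {⊥ ← a, b.}.

```python
from itertools import combinations


def all_models(program, atoms):
    """Classical models of a positive program over the given atoms."""
    atoms = sorted(atoms)
    candidates = (frozenset(c) for k in range(len(atoms) + 1)
                  for c in combinations(atoms, k))
    return [i for i in candidates
            if all((not body <= i) or bool(head & i)
                   for head, body in program)]


def vf_minimal(program, atoms, V=(), F=()):
    """(V, F)-minimal models: minimize atoms outside V and F,
    let V vary freely and keep the truth values of F fixed."""
    V, F = frozenset(V), frozenset(F)
    models = all_models(program, atoms)

    def strictly_better(n, m):
        return (n - (V | F)) < (m - (V | F)) and (n & F) == (m & F)

    return {m for m in models
            if not any(strictly_better(n, m) for n in models)}


atoms = {"a", "b"}
pi1 = [(frozenset("ab"), frozenset())]        # a v b.
pi2 = [(frozenset(), frozenset("ab"))]        # integrity constraint on a, b

m1 = vf_minimal(pi1, atoms, V={"a"})          # b is minimized, a varies
m2 = vf_minimal(pi2, atoms, V={"b"})          # a is minimized, b varies
m12 = vf_minimal(pi1 + pi2, atoms, V={"a", "b"})
plain = vf_minimal(pi1, atoms)                # ordinary minimal models
```

With V and F empty the function reduces to ordinary subset-minimal models, which matches the observation that MM(Π) is the special case of (∅, ∅)-minimality.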

3.1 PFM Translation Functions 

To answer the preceding question, we apply an analysis method [11,12] which is based on
the existence of polynomial, faithful and modular translation functions between classes
of logic programs. These properties are formalized in Definition 5 below, but first we
state conditions on which two DLPs $\Pi$ and $\Pi'$ are viewed as separate program modules
that can be combined together to form a larger program $\Pi \sqcup \Pi'$.$^{1}$

Definition 4. Two PDLPs $\Pi$ and $\Pi'$ satisfy module conditions if and only if $\Pi \cap \Pi' = \emptyset$,
$\mathrm{Hb}_v(\Pi) = \mathrm{Hb}_v(\Pi')$, $\mathrm{Hb}_h(\Pi) \cap \mathrm{Hb}(\Pi') = \emptyset$, and $\mathrm{Hb}(\Pi) \cap \mathrm{Hb}_h(\Pi') = \emptyset$.

The intuition behind the conditions listed in Definition 4 is that the program modules
$\Pi$ and $\Pi'$ possess identical program interfaces for mutual interaction and they do not
share rules nor hidden atoms. If $\Pi$ and $\Pi'$ share rules, then $\Pi \setminus \Pi'$, $\Pi' \setminus \Pi$, and $\Pi \cap \Pi'$
might be identified as disjoint program modules, if admitted by the other conditions.

Definition 5. Let $\mathcal{C}$ and $\mathcal{C}'$ be two classes of logic programs. A translation function
$\mathrm{Tr} : \mathcal{C} \to \mathcal{C}'$ is defined to be

1. polynomial, iff for all programs $\Pi \in \mathcal{C}$, the translation $\mathrm{Tr}(\Pi) \in \mathcal{C}'$ can be computed
in time (and hence also space) polynomial in $\|\Pi\|$;

2. faithful, iff for all programs $\Pi \in \mathcal{C}$, $\Pi \equiv_v \mathrm{Tr}(\Pi)$;

3. modular, iff for all programs $\Pi \in \mathcal{C}$ and $\Pi' \in \mathcal{C}$ satisfying module conditions,
the translation $\mathrm{Tr}(\Pi \sqcup \Pi') = \mathrm{Tr}(\Pi) \sqcup \mathrm{Tr}(\Pi')$ where the translations $\mathrm{Tr}(\Pi)$ and
$\mathrm{Tr}(\Pi')$ satisfy module conditions.

It can be shown that these three properties are preserved under compositions [12]. In
particular, the modularity condition differs from the one used in [11]. This is to support
translation functions between classes of logic programs (or the like) that do not share syntax.
Moreover, the module conditions in Definition 4 are more liberal than those used by Eiter
et al. [6], which enables richer interaction between program modules.

In the sequel, we use the existence of a polynomial, faithful and modular (PFM)
translation function as a criterion when comparing classes of logic programs by
expressive power. A class of logic programs $\mathcal{C}$ is at least as expressive as another class $\mathcal{C}'$ iff
there is a PFM translation function $\mathrm{Tr} : \mathcal{C}' \to \mathcal{C}$. We write $\mathcal{C}' \leq_{\mathrm{PFM}} \mathcal{C}$ to denote such
a relationship. If both $\mathcal{C}' \leq_{\mathrm{PFM}} \mathcal{C}$ and $\mathcal{C} \leq_{\mathrm{PFM}} \mathcal{C}'$ hold, then $\mathcal{C}$ and $\mathcal{C}'$ are regarded
as equally expressive classes, denoted by $\mathcal{C} =_{\mathrm{PFM}} \mathcal{C}'$. In certain cases, we succeed in
finding a counter-example to establish a negative relationship $\mathcal{C} \not\leq_{\mathrm{PFM}} \mathcal{C}'$.$^{2}$ If, in addition,
$\mathcal{C}' \leq_{\mathrm{PFM}} \mathcal{C}$ holds, then $\mathcal{C}$ is strictly more expressive than $\mathcal{C}'$, denoted by $\mathcal{C}' <_{\mathrm{PFM}} \mathcal{C}$.
Finally, two classes may also turn out to be incomparable in terms of PFM translation
functions, denoted by $\mathcal{C} \neq_{\mathrm{PFM}} \mathcal{C}'$, if and only if both $\mathcal{C} \not\leq_{\mathrm{PFM}} \mathcal{C}'$ and $\mathcal{C}' \not\leq_{\mathrm{PFM}} \mathcal{C}$ hold.

1 The symbol $\sqcup$ denotes disjoint union.

2 Sometimes we do not need all the three properties to form a counter-example and we may drop
the respective letters from the notation. E.g. $\mathcal{C} \not\leq_{\mathrm{FM}} \mathcal{C}'$ implies $\mathcal{C} \not\leq_{\mathrm{PFM}} \mathcal{C}'$ in general.







3.2 Expressiveness Analysis 

In this section, we apply the classification method presented in Section 3.1 to analyze
$\mathcal{D}^+_{mvf}$, which is defined as the class of PDLPs involving atoms being minimized (m),
varying atoms (v), and fixed atoms (f). The semantics of a PDLP $\Pi$ from this class
is determined by $\mathrm{MM}_{V,F}(\Pi)$ rather than $\mathrm{SM}(\Pi) = \mathrm{MM}(\Pi)$; recall that $\Pi$ has the
sets $V \subseteq \mathrm{Hb}(\Pi)$ and $F \subseteq \mathrm{Hb}(\Pi)$ associated with it. We obtain six subclasses of
$\mathcal{D}^+_{mvf}$ by insisting that one or two of the sets $\mathrm{Hb}(\Pi) \setminus (V \cup F)$, $V$, and $F$ are empty
for PDLPs included in the subclass. Such a restriction corresponds to denying
minimized/varying/fixed atoms and we drop the corresponding letter(s) from the notation
when referring to the respective subclass of $\mathcal{D}^+_{mvf}$. For instance, $\mathcal{D}^+_{m}$ denotes the class
of PDLPs under the standard semantics according to which all atoms are subject to
minimization, i.e. the sets $V$ and $F$ are both empty for all PDLPs $\Pi$ within this class.

We begin the analysis with fixed atoms. It is a well-known fact that they can be
eliminated in general [3], but our interest in this respect is to check that the elimination
can be accomplished using a PFM translation function.

Theorem 1. $\mathcal{D}^+_{mvf} \leq_{\mathrm{PFM}} \mathcal{D}^+_{mv}$ and $\mathcal{D}^+_{mf} \leq_{\mathrm{PFM}} \mathcal{D}^+_{m}$.

Proof (sketch). Let $\Pi$ be a PDLP, and $V$ and $F$ the sets of varying and fixed atoms,
respectively. The class $\mathcal{D}^+_{mf}$ can be covered by further assuming $V = \emptyset$. De Kleer and
Konolige [3] propose the following technique to remove $F$. A new atom $f' \notin \mathrm{Hb}(\Pi)$
is introduced for each $f \in F$. The translation $\mathrm{Tr}_{\mathrm{KK}}(\Pi) = \Pi \cup \{f \vee f'.\ \bot \leftarrow f, f'. \mid
f \in F\}$ with the set of atoms $(\mathrm{Hb}(\Pi) \setminus V) \cup \{f' \mid f \in F\}$ subject to minimization.
The visible Herbrand base $\mathrm{Hb}_v(\mathrm{Tr}_{\mathrm{KK}}(\Pi))$ can be defined as $\mathrm{Hb}_v(\Pi)$.

It is easy to see that $\mathrm{Tr}_{\mathrm{KK}}$ is linear. For the faithfulness of $\mathrm{Tr}_{\mathrm{KK}}$, we note that
$(V, F)$-minimal models $M$ of $\Pi$ are in a bijective relationship with the $(V, \emptyset)$-minimal models
$M' = M \cup \{f' \mid f \in F \text{ and } f \notin M\}$ of $\mathrm{Tr}_{\mathrm{KK}}(\Pi)$. For the modularity of $\mathrm{Tr}_{\mathrm{KK}}$, we
suppose that two PDLPs $\Pi$ and $\Pi'$ with the sets of varying atoms $V$ and $V'$ and the
sets of fixed atoms $F$ and $F'$, respectively, satisfy the module conditions. It is clear that
$\mathrm{Tr}_{\mathrm{KK}}(\Pi)$ and $\mathrm{Tr}_{\mathrm{KK}}(\Pi')$ are disjoint and $\mathrm{Tr}_{\mathrm{KK}}(\Pi \sqcup \Pi') = \mathrm{Tr}_{\mathrm{KK}}(\Pi) \sqcup \mathrm{Tr}_{\mathrm{KK}}(\Pi')$, as
$\Pi$ and $\Pi'$ as well as $F$ and $F'$ are disjoint by the module conditions. Moreover, we have
$\mathrm{Hb}_v(\mathrm{Tr}_{\mathrm{KK}}(\Pi)) = \mathrm{Hb}_v(\mathrm{Tr}_{\mathrm{KK}}(\Pi'))$ by definition, because $\mathrm{Hb}_v(\Pi) = \mathrm{Hb}_v(\Pi')$ by
the module conditions. Finally, the translations $\mathrm{Tr}_{\mathrm{KK}}(\Pi)$ and $\mathrm{Tr}_{\mathrm{KK}}(\Pi')$ do not share
hidden atoms as the modules $\Pi$ and $\Pi'$ do not. □
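The faithfulness argument for the translation can be tried out on a one-rule program. The sketch below is an illustration under assumptions: rules are (head, body) pairs of frozensets, the primed atom is spelled by appending an apostrophe, and `tr_kk`, `all_models` and `vf_minimal` are hypothetical helper names. The sample program {a ∨ f.} with F = {f} is chosen here for illustration and does not come from the paper.

```python
from itertools import combinations


def all_models(program, atoms):
    atoms = sorted(atoms)
    candidates = (frozenset(c) for k in range(len(atoms) + 1)
                  for c in combinations(atoms, k))
    return [i for i in candidates
            if all((not body <= i) or bool(head & i)
                   for head, body in program)]


def vf_minimal(program, atoms, V=(), F=()):
    V, F = frozenset(V), frozenset(F)
    models = all_models(program, atoms)

    def strictly_better(n, m):
        return (n - (V | F)) < (m - (V | F)) and (n & F) == (m & F)

    return {m for m in models
            if not any(strictly_better(n, m) for n in models)}


def tr_kk(program, F):
    """Add  f v f'.  and  bottom <- f, f'.  for every fixed atom f."""
    extra = []
    for f in sorted(F):
        f_new = f + "'"
        extra.append((frozenset({f, f_new}), frozenset()))
        extra.append((frozenset(), frozenset({f, f_new})))
    return list(program) + extra


pi = [(frozenset({"a", "f"}), frozenset())]           # a v f.
F = {"f"}

before = vf_minimal(pi, {"a", "f"}, F=F)               # f fixed
after = vf_minimal(tr_kk(pi, F), {"a", "f", "f'"})     # everything minimized

# The bijection from the proof: M  ->  M + {f' | f in F, f not in M}
mapped = {m | frozenset(f + "'" for f in F if f not in m) for m in before}
```

In this instance `before` consists of {a} and {f}, and `mapped` coincides with `after`, so projecting the models of the translation back to the original atoms recovers exactly the (∅, F)-minimal models.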

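Thus $\mathcal{D}^+_{mv} \subseteq \mathcal{D}^+_{mvf}$ and $\mathcal{D}^+_{m} \subseteq \mathcal{D}^+_{mf}$ imply $\mathcal{D}^+_{mv} =_{\mathrm{PFM}} \mathcal{D}^+_{mvf}$ and $\mathcal{D}^+_{m} =_{\mathrm{PFM}} \mathcal{D}^+_{mf}$.

Theorem 2. $\mathcal{D}^+_{mv} \not\leq_{\mathrm{FM}} \mathcal{D}^+_{m}$.

Proof. Let us assume that there is a faithful and modular translation function $\mathrm{Tr} :
\mathcal{D}^+_{mv} \to \mathcal{D}^+_{m}$ that effectively removes varying atoms. Then consider two disjoint logic
programs $\Pi_1 = \{a \vee b.\}$ and $\Pi_2 = \{\bot \leftarrow b, a.\}$ based on $\mathrm{Hb}(\Pi_1) = \mathrm{Hb}(\Pi_2) =
\{a, b\}$ with all atoms visible, i.e. $\mathrm{Hb}_h(\Pi_1) = \mathrm{Hb}_h(\Pi_2) = \emptyset$. Then let us define $V_1 = \{a\}$
and $V_2 = \{b\}$ as the sets of varying atoms associated with $\Pi_1$ and $\Pi_2$, respectively. As
regards $\Pi_1$ and $\Pi_2$, it is straightforward to verify that

1. the only $(V_1, \emptyset)$-minimal model of $\Pi_1$ is $M_1 = \{a\}$;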



140 



T. Janhunen and E. Oikarinen 



2. the program $\Pi_2$ has two $(V_2, \emptyset)$-minimal models $M_2 = \{b\}$ and $M_3 = \emptyset$; and

3. the program $\Pi_1 \cup \Pi_2$ has two $(V_1 \cup V_2, \emptyset)$-minimal models $M_1$ and $M_2$.

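On the other hand, the translations $\mathrm{Tr}(\Pi_1, V_1)$, $\mathrm{Tr}(\Pi_2, V_2)$, and $\mathrm{Tr}(\Pi_1 \cup \Pi_2, V_1 \cup V_2)$
are PDLPs all of whose atoms are subject to minimization. Since $\mathrm{Tr}$ is faithful, we know
that $\mathrm{Tr}(\Pi_1, V_1)$ has a unique $(\emptyset, \emptyset)$-minimal model $N$ such that $N \cap \mathrm{Hb}(\Pi_1) = M_1$, and
$\mathrm{Tr}(\Pi_1 \cup \Pi_2, V_1 \cup V_2)$ has two $(\emptyset, \emptyset)$-minimal models $N_1$ and $N_2$ such that
$N_1 \cap \mathrm{Hb}(\Pi_1 \cup \Pi_2) = M_1 = \{a\}$ and $N_2 \cap \mathrm{Hb}(\Pi_1 \cup \Pi_2) = M_2 = \{b\}$.

Using the modularity of $\mathrm{Tr}$, we obtain $\mathrm{Tr}(\Pi_1 \cup \Pi_2, V_1 \cup V_2) = \mathrm{Tr}(\Pi_1, V_1) \cup \mathrm{Tr}(\Pi_2, V_2)$.
Since $N_2 \models \mathrm{Tr}(\Pi_1 \cup \Pi_2, V_1 \cup V_2)$, we obtain $N_2 \models \mathrm{Tr}(\Pi_1, V_1)$. It follows
that $N' \models \mathrm{Tr}(\Pi_1, V_1)$ holds for the restricted model $N' = N_2 \cap \mathrm{Hb}(\mathrm{Tr}(\Pi_1, V_1))$
from which the local atoms of $\mathrm{Tr}(\Pi_2, V_2)$ have been removed. Recall that
$\mathrm{Hb}(\Pi_1) \subseteq \mathrm{Hb}(\mathrm{Tr}(\Pi_1, V_1))$ by the faithfulness of $\mathrm{Tr}$. Because $N' \models \mathrm{Tr}(\Pi_1, V_1)$, every model
of a positive program contains a minimal model, and $N$ is the
unique $(\emptyset, \emptyset)$-minimal model of $\mathrm{Tr}(\Pi_1, V_1)$, we obtain $N \subseteq N'$. A contradiction, since
$a \in N$ but $a \notin N'$. To conclude, such a translation function $\mathrm{Tr}$ does not exist. □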
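
The three claims about the minimal models of the two programs used in this proof can be confirmed by exhaustive enumeration. The sketch below assumes the usual definition of $(V, \emptyset)$-minimality (a model is minimal if no other model is strictly smaller on the atoms outside $V$); the `(head, body)` rule encoding and the function names are illustrative choices only.

```python
from itertools import combinations

def models(rules, atoms):
    """Classical models of positive rules given as (head, body) atom lists."""
    atoms = sorted(atoms)
    cands = (frozenset(c) for r in range(len(atoms) + 1)
             for c in combinations(atoms, r))
    return [m for m in cands
            if all(not set(b) <= m or set(h) & m for h, b in rules)]

def v_minimal(rules, atoms, varying):
    """(V, {})-minimal models: minimal on P = atoms - V, while V floats."""
    p = set(atoms) - set(varying)
    ms = models(rules, atoms)
    return {m for m in ms if not any((n & p) < (m & p) for n in ms)}

pi1 = [(["a", "b"], [])]          # a v b.
pi2 = [([], ["a", "b"])]          # _|_ <- b, a.
hb = {"a", "b"}

m_pi1 = v_minimal(pi1, hb, {"a"})                # V1 = {a}
m_pi2 = v_minimal(pi2, hb, {"b"})                # V2 = {b}
m_union = v_minimal(pi1 + pi2, hb, {"a", "b"})   # V1 u V2
```

Since the union has no minimized atoms left, every classical model of the union is minimal, which is why the model containing only `b` appears there although it is not minimal for the first program alone.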

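It follows that $\mathcal{D}^+_{\mathrm{m}} <_{\mathrm{PFM}} \mathcal{D}^+_{\mathrm{mv}}$, since $\mathcal{D}^+_{\mathrm{m}}$ is a subclass of $\mathcal{D}^+_{\mathrm{mv}}$. The first condition of
Definition 3 implies that the classes $\mathcal{D}^+_{\mathrm{v}}$, $\mathcal{D}^+_{\mathrm{f}}$, and $\mathcal{D}^+_{\mathrm{vf}}$ collapse to classical logic, i.e. the
semantics assigned to a PDLP $\Pi$ is $\mathrm{CM}(\Pi) = \{M \subseteq \mathrm{Hb}(\Pi) \mid M \models \Pi\}$. Moreover,
PFM translation functions are easily obtained for each pair of classes. E.g., a translation
function from $\mathcal{D}^+_{\mathrm{v}}$ to $\mathcal{D}^+_{\mathrm{f}}$ simply exchanges the roles of varying and fixed atoms. This
is semantically irrelevant, as no atoms are subject to minimization. Such a translation
function is trivially PFM. Hence, we have $\mathcal{D}^+_{\mathrm{v}} =_{\mathrm{PFM}} \mathcal{D}^+_{\mathrm{f}} =_{\mathrm{PFM}} \mathcal{D}^+_{\mathrm{vf}}$ and there is only
one relationship to be further explored.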

Theorem 3. $\mathcal{D}^+_{\mathrm{m}} \not\leq_{\mathrm{FM}} \mathcal{D}^+_{\mathrm{v}}$.

Proof. Consider $\Pi_1 = \{a \leftarrow a.\}$ and $\Pi_2 = \{a.\}$ which have unique $(\emptyset, \emptyset)$-minimal
models $M_1 = \emptyset$ and $M_2 = \{a\}$, respectively. Assuming the existence of a faithful
and modular translation function $\mathrm{Tr}$, we obtain that $\mathrm{Tr}(\Pi_1)$ and $\mathrm{Tr}(\Pi_2)$ have unique
classical models $N_1$ and $N_2$, respectively, such that $N_i \cap \mathrm{Hb}(\Pi_i) = M_i$ for $i \in \{1, 2\}$.
Thus $\mathrm{Tr}(\Pi_1 \cup \Pi_2) = \mathrm{Tr}(\Pi_1) \cup \mathrm{Tr}(\Pi_2)$ is necessarily inconsistent, contradicting the
faithfulness of $\mathrm{Tr}$, as $M_2$ is the unique $(\emptyset, \emptyset)$-minimal model of $\Pi_1 \cup \Pi_2$. □

The resulting expressive power hierarchy is summarized in Figure 1. There are three
equivalence classes under PFM translation functions. The most expressive class corresponds
to Lifschitz's parallel circumscription [16] while the class in the middle captures ordinary
circumscription proposed by McCarthy [18]. The class at the bottom corresponds to classical
logic. In spite of certain differences, these results can be understood as a refinement to an
analogous hierarchy derived for non-monotonic logics [10] where the lower end of the
hierarchy consists of parallel circumscription and classical logic; the former ranked strictly
more expressive than the latter. Let us also note that current disjunctive solvers
[15,13] cover the hierarchy up to the class in the middle.



Fig. 1. Hierarchy Implied by the Expressiveness Analysis




Capturing Parallel Circumscription with Disjunctive Logic Programs 141 



4 Eliminating Varying Atoms 

In this section, we present a non-modular translation function $\mathrm{Tr}_{\mathrm{BLIND}}$ which enables us
to remove varying atoms from a PDLP $\Pi$ in a faithful way, i.e. $(V, F)$-minimal models
$M$ of $\Pi$ and the stable models $N$ of its translation are in a bijective relationship such
that $M = N \cap \mathrm{Hb}(\Pi)$ holds for each pair of models. For the sake of simplicity, we
assume that fixed atoms have already been removed (recall $\mathrm{Tr}_{\mathrm{KK}}$ from Theorem 1).

The translation function $\mathrm{Tr}_{\mathrm{BLIND}}$ introduces new atoms, which do not appear in
$\mathrm{Hb}(\Pi)$, as follows. For each $a \in \mathrm{Hb}(\Pi)$, the complement $\overline{a}$ of $a$ expresses the falsity of
$a$. Moreover, a renamed copy $a^*$ of each $a \in \mathrm{Hb}(\Pi)$ is needed when formulating a test
for $(V, \emptyset)$-minimality. Likewise, a vector of new atoms $d_1, \ldots, d_n$ is introduced for the
set of atoms $P = \mathrm{Hb}(\Pi) \setminus V = \{a_1, \ldots, a_n\}$ subject to minimization. Yet another new
atom, namely $u$, will be used in the translation. Given a set of atoms $A \subseteq \mathrm{Hb}(\Pi)$, we
introduce shorthands $\overline{A}$ and $A^*$ for the sets $\{\overline{a} \mid a \in A\}$ and $\{a^* \mid a \in A\}$, respectively.

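Definition 6. Let $\Pi$ be a PDLP and $V \subseteq \mathrm{Hb}(\Pi)$ a set of varying atoms. Let us define
$P = \mathrm{Hb}(\Pi) \setminus V = \{a_1, \ldots, a_n\}$ and a translation $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$ containing

1. rules $a \leftarrow {\sim}\overline{a}$ and $\overline{a} \leftarrow {\sim}a$ for each $a \in \mathrm{Hb}(\Pi)$;

2. a rule $\bot \leftarrow {\sim}A, {\sim}\overline{B}$ for each rule $A \leftarrow B$ in $\Pi$;

3. a rule $A^* \cup \{u\} \leftarrow B^*$ for each rule $A \leftarrow B$ in $\Pi$;

4. a rule $d_1 \vee \cdots \vee d_n \vee u$;

5. rules $u \leftarrow d_i, {\sim}a_i$ and $u \leftarrow a_i^*, {\sim}a_i$ for each $1 \leq i \leq n$;

6. rules $u \leftarrow d_i, a_i^*, {\sim}\overline{a_i}$ and $u \vee d_i \vee a_i^* \leftarrow {\sim}\overline{a_i}$ for each $1 \leq i \leq n$;

7. a rule $a^* \leftarrow u$ for each $a \in \mathrm{Hb}(\Pi)$;

8. a rule $d_i \leftarrow u$ for each $1 \leq i \leq n$; and

9. a rule $\bot \leftarrow {\sim}u$.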

The rules included in $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$ serve the following purposes. (1.) An arbitrary
interpretation $M \subseteq \mathrm{Hb}(\Pi)$ is chosen for the PDLP $\Pi$. (2.) It is ensured that $M \models \Pi$
holds in the classical sense. (3.) A renamed copy of $\Pi$ is created to check the $(V, \emptyset)$-minimality
of $M$. In analogy to [13], this can be achieved by checking whether

$\mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, P, M) = \Pi \cup \{\bot \leftarrow P \cap M\} \cup \{\bot \leftarrow a \mid a \in P \setminus M\}$   (3)

is unsatisfiable for $M$ and the set of atoms $P = \{a_1, \ldots, a_n\}$ subject to minimization.
This is why the intuitive reading of $u$ is "unsatisfiable", which captures the desired state of
affairs, implying the $(V, \emptyset)$-minimality of $M$. (4.) The disjunction $d_1 \vee \cdots \vee d_n$ captures
the rule $\bot \leftarrow P \cap M$ from (3). This rule depends dynamically on $M$ and it effectively
states the falsity of at least one atom $a_i$ that is both subject to minimization ($a_i \in P$) and
true in $M$ ($a_i \in M$). (5.) The rules cover the case that $a_i$ is false in $M$, i.e. $a_i \in P \setminus M$.
Conforming to (3), both $d_i$ and $a_i^*$ are implicitly assigned to false, as they imply $u$.
Otherwise, $a_i$ is true in $M$ which activates the rules in (6.) enforcing $d_i$ equivalent to
the negation of $a_i^*$. The net effect of the rules included in (4.) - (6.) is that any potential
counter-model $N \models \Pi$ for the $(V, \emptyset)$-minimality of $M$, expressed in $\mathrm{Hb}(\Pi)^*$ rather
than $\mathrm{Hb}(\Pi)$, must satisfy $N \cap P \subset M \cap P$ ($\iff N \setminus V \subset M \setminus V$).

The rules given in items (7.) - (9.) are directly related to the unsatisfiability check 
which effectively proves that counter-models like N above do not exist. To implement 
the test for unsatisfiability, we adopt the technique used earlier by Eiter and Gottlob [5]. 







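Example 2. Consider a program $\Pi = \{f \vee ab.\}$ which is a simplified version of
Lifschitz's ostrich example [16]. This program has a unique $(\{f\}, \emptyset)$-minimal model
$M = \{f\}$. The translation $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$ includes the following rules:
(1.) $f \leftarrow {\sim}\overline{f}$. $\overline{f} \leftarrow {\sim}f$. $ab \leftarrow {\sim}\overline{ab}$. $\overline{ab} \leftarrow {\sim}ab$.
(2.) $\bot \leftarrow {\sim}f, {\sim}ab$. (3.) $f^* \vee ab^* \vee u$. (4.) $d \vee u$.
(5.) $u \leftarrow d, {\sim}ab$. $u \leftarrow ab^*, {\sim}ab$.
(6.) $u \leftarrow d, ab^*, {\sim}\overline{ab}$. $u \vee d \vee ab^* \leftarrow {\sim}\overline{ab}$.
(7.) $ab^* \leftarrow u$. $f^* \leftarrow u$. (8.) $d \leftarrow u$. (9.) $\bot \leftarrow {\sim}u$.
There is only one stable model for $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$, i.e. $N = \{f, \overline{ab}, f^*, ab^*, d, u\}$,
for which $M = N \cap \{f, ab\}$ holds. □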
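
The example can be replayed mechanically. The sketch below generates the nine groups of rules of Definition 6 and enumerates stable models by brute force, checking that each candidate is a minimal model of its own reduct. It is an illustration under assumptions rather than the authors' code: rules are hypothetical `(head, positive body, negative body)` triples of atom lists, and the suffixes `_bar` and `_star` and the prefix `d_` are naming conventions invented here for complements, renamed copies, and the atoms $d_i$.

```python
from itertools import combinations

def bar(a):  return a + "_bar"    # complement atom
def star(a): return a + "_star"   # renamed copy

def tr_blind(program, atoms, varying):
    """Rules of Definition 6 as (head, positive body, negative body) triples."""
    atoms = sorted(atoms)
    p = [a for a in atoms if a not in varying]   # atoms subject to minimization
    d = {a: "d_" + a for a in p}
    rules = []
    for a in atoms:                                                  # item 1
        rules += [([a], [], [bar(a)]), ([bar(a)], [], [a])]
    for head, body in program:
        rules.append(([], [], list(head) + [bar(b) for b in body]))  # item 2
        rules.append(([star(a) for a in head] + ["u"],
                      [star(b) for b in body], []))                  # item 3
    rules.append((list(d.values()) + ["u"], [], []))                 # item 4
    for a in p:
        rules += [(["u"], [d[a]], [a]), (["u"], [star(a)], [a])]     # item 5
        rules += [(["u"], [d[a], star(a)], [bar(a)]),
                  (["u", d[a], star(a)], [], [bar(a)])]              # item 6
    rules += [([star(a)], ["u"], []) for a in atoms]                 # item 7
    rules += [([d[a]], ["u"], []) for a in p]                        # item 8
    rules.append(([], [], ["u"]))                                    # item 9
    return rules

def subsets(s):
    s = sorted(s)
    return (frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r))

def satisfies(interp, positive_rules):
    """An empty head is a constraint; it fails when its body holds."""
    return all(not set(pos) <= interp or set(head) & interp
               for head, pos in positive_rules)

def stable_models(rules):
    """Brute force: N is stable iff N is a minimal model of the reduct w.r.t. N."""
    universe = {x for h, p, n in rules for x in h + p + n}
    found = []
    for cand in subsets(universe):
        reduct = [(h, p) for h, p, n in rules if not set(n) & cand]
        if satisfies(cand, reduct) and not any(
                satisfies(sub, reduct) for sub in subsets(cand) if sub != cand):
            found.append(cand)
    return found

# Example 2: Pi = { f v ab. } with f varying.
models = stable_models(tr_blind([(["f", "ab"], [])], {"f", "ab"}, {"f"}))
visible = [m & {"f", "ab"} for m in models]
```

The enumeration is exponential in the number of atoms, so it only serves to check small examples; a disjunctive solver would be used in practice.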

Our next objective is to establish that the translation function $\mathrm{Tr}_{\mathrm{BLIND}}$ given in
Definition 6 is faithful, i.e. the $(V, \emptyset)$-minimal models of a PDLP $\Pi$ are in a bijective
relationship with the stable models of $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$. In analogy to [19], we implement
the test for $(V, \emptyset)$-minimality through propositional unsatisfiability.

Lemma 1. Given a PDLP $\Pi$ and $V \subseteq \mathrm{Hb}(\Pi)$, a model $M \subseteq \mathrm{Hb}(\Pi)$ of $\Pi$ is $(V, \emptyset)$-minimal
if and only if $\mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, \mathrm{Hb}(\Pi) \setminus V, M)$, as defined in (3), is unsatisfiable.
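
The lemma can be checked on small programs by comparing the unsatisfiability test with a direct search for a smaller model. This is a brute-force sketch under the assumption that $(V, \emptyset)$-minimality means that no model is strictly smaller on $P = \mathrm{Hb}(\Pi) \setminus V$; the function names and the `(head, body)` rule encoding are illustrative.

```python
from itertools import combinations

def all_models(rules, atoms):
    """Classical models of positive rules given as (head, body) atom lists."""
    atoms = sorted(atoms)
    cands = (frozenset(c) for r in range(len(atoms) + 1)
             for c in combinations(atoms, r))
    return [m for m in cands
            if all(not set(b) <= m or set(h) & m for h, b in rules)]

def tr_unsat(rules, p, m):
    """Pi  u  { _|_ <- P n M }  u  { _|_ <- a | a in P - M }, as in (3)."""
    return rules + [([], sorted(p & m))] + [([], [a]) for a in sorted(p - m)]

def minimal_by_unsat(rules, atoms, varying, m):
    p = set(atoms) - set(varying)
    return not all_models(tr_unsat(rules, p, m), atoms)

def minimal_by_definition(rules, atoms, varying, m):
    p = set(atoms) - set(varying)
    return not any((n & p) < (m & p) for n in all_models(rules, atoms))

# The program of Example 2: f v ab.  with f varying, so P = {ab}.
program = [(["f", "ab"], [])]
atoms, varying = {"f", "ab"}, {"f"}
agreement = all(
    minimal_by_unsat(program, atoms, varying, m)
    == minimal_by_definition(program, atoms, varying, m)
    for m in all_models(program, atoms))
```

Note the boundary case in which $P \cap M$ is empty: the constraint then has an empty body, is violated by every interpretation, and the test correctly reports minimality.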

We split the translation $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$ in two parts using the Splitting Set Theorem [17]
which we formulate for stable models rather than answer sets used in [17]. A splitting
set for a DLP $\Pi$ is any set $U \subseteq \mathrm{Hb}(\Pi)$ such that for every rule $A \leftarrow B, {\sim}C \in \Pi$,
if $A \cap U \neq \emptyset$ then $A \cup B \cup C \subseteq U$. The set of rules $A \leftarrow B, {\sim}C \in \Pi$ such
that $A \cup B \cup C \subseteq U$ is the bottom of $\Pi$ relative to $U$, denoted by $\mathrm{b}_U(\Pi)$. The set
$\mathrm{t}_U(\Pi) = \Pi \setminus \mathrm{b}_U(\Pi)$ is the top of $\Pi$ relative to $U$ which can be partially evaluated
with respect to an interpretation $X \subseteq U$. The result is a DLP $\mathrm{e}_U(\mathrm{t}_U(\Pi), X)$ defined as
$\{A \leftarrow (B \setminus U), {\sim}(C \setminus U) \mid A \leftarrow B, {\sim}C \in \mathrm{t}_U(\Pi),\ B \cap U \subseteq X \text{ and } (C \cap U) \cap X = \emptyset\}$.
Given a splitting set $U$ for a program $\Pi$, a solution to $\Pi$ with respect to $U$ is a pair $\langle X, Y \rangle$
such that (i) $X \subseteq U$ is a stable model of $\mathrm{b}_U(\Pi)$ and (ii) $Y \subseteq \mathrm{Hb}(\Pi) \setminus U$ is a stable
model of $\mathrm{e}_U(\mathrm{t}_U(\Pi), X)$. Solutions and stable models relate as follows.

Theorem 4 (Splitting Set Theorem [17]). Let $U$ be a splitting set for a DLP $\Pi$ and
$M \subseteq \mathrm{Hb}(\Pi)$ an interpretation. Then $M \in \mathrm{SM}(\Pi)$ if and only if the pair $\langle X, Y \rangle$ with
$X = M \cap U$ and $Y = M \setminus U$ is a solution to $\Pi$ with respect to $U$.
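
The notions of bottom, top, and partial evaluation translate directly into set operations. The following sketch assumes rules encoded as hypothetical `(head, positive body, negative body)` triples of atom lists; the four-rule program is an invented example, not one taken from the paper.

```python
def rule_atoms(rule):
    head, pos, neg = rule
    return set(head) | set(pos) | set(neg)

def is_splitting_set(rules, u):
    """Every rule whose head meets U must lie entirely within U."""
    return all(rule_atoms(r) <= u for r in rules if set(r[0]) & u)

def bottom(rules, u):
    return [r for r in rules if rule_atoms(r) <= u]

def top(rules, u):
    return [r for r in rules if not rule_atoms(r) <= u]

def evaluate(top_rules, u, x):
    """e_U(t_U(Pi), X): keep rules whose U-literals agree with X, then drop them."""
    result = []
    for head, pos, neg in top_rules:
        if set(pos) & u <= x and not (set(neg) & u & x):
            result.append((head,
                           [b for b in pos if b not in u],
                           [c for c in neg if c not in u]))
    return result

# Invented example: a <- ~b.  b <- ~a.  c <- a.  d <- b, ~c.
rules = [(["a"], [], ["b"]), (["b"], [], ["a"]),
         (["c"], ["a"], []), (["d"], ["b"], ["c"])]
u = {"a", "b"}
```

With $X = \{a\}$ the evaluated top reduces to the fact `c.`, and with $X = \{b\}$ to the rule `d <- ~c.`, so the stable models of the whole program are obtained by extending each stable model of the bottom separately.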

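We use the set of atoms $U = \mathrm{Hb}(\Pi) \cup \{\overline{a} \mid a \in \mathrm{Hb}(\Pi)\}$ to split $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$: the
bottom $\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))$ consists of items 1 and 2 in Definition 6, whereas the partially
evaluated top $\mathrm{e}_U(\mathrm{t}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)), X)$ consists of items 3, 4 and 7-9 in Definition 6
as such and the following rules corresponding to rules in items 5 and 6:

5'. $u \leftarrow d_i$ and $u \leftarrow a_i^*$ where $1 \leq i \leq n$ and $a_i \in P \setminus X$; and

6'. $u \leftarrow d_i, a_i^*$ and $u \vee d_i \vee a_i^*$ where $1 \leq i \leq n$ and $a_i \in P \cap X$.

Thus $\mathrm{Hb}(\mathrm{e}_U(\mathrm{t}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)), X)) = \{a^* \mid a \in \mathrm{Hb}(\Pi)\} \cup \{d_i \mid 1 \leq i \leq n\} \cup \{u\}$.
We use the notation $\mathrm{E}_U(\Pi, X) = \mathrm{e}_U(\mathrm{t}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)), X)$ for the sake of brevity.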

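It is shown next that there is a one-to-one correspondence between the models in
$\mathrm{SM}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)))$ and $\mathrm{CM}(\Pi)$. As a consequence, the stable models of the bottom
$\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))$ are classical models of $\Pi$ extended to $\mathrm{Hb}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)))$.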

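Proposition 1. Let $\Pi$ be a PDLP.

The function $\mathrm{Ext}_{\mathrm{B}} : \mathrm{CM}(\Pi) \rightarrow 2^{\mathrm{Hb}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)))}$ defined by
$\mathrm{Ext}_{\mathrm{B}}(M) = M \cup \{\overline{a} \mid a \in \mathrm{Hb}(\Pi) \setminus M\}$ is a bijection from $\mathrm{CM}(\Pi)$ to $\mathrm{SM}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)))$.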







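Proof. It is shown below that (i) the image of $\mathrm{CM}(\Pi)$ under $\mathrm{Ext}_{\mathrm{B}}$ is a subset of
$\mathrm{SM}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)))$, (ii) $\mathrm{Ext}_{\mathrm{B}}$ is an injection, and (iii) $\mathrm{Ext}_{\mathrm{B}}$ is a surjection.

(i) Assume that $M \in \mathrm{CM}(\Pi)$, i.e. $M \models \Pi$. It is clear that $X \models \mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))$
holds for $X = \mathrm{Ext}_{\mathrm{B}}(M)$ and it suffices to prove $X \in \mathrm{MM}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))^X)$.
Since $M \models \Pi$, the reduct $\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))^X$ contains only the rules $a \leftarrow$ for
$\overline{a} \notin X$ and $\overline{a} \leftarrow$ for $a \notin X$. Thus $X \in \mathrm{MM}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))^X)$.

(ii) If $M_1 \neq M_2$, then $\mathrm{Ext}_{\mathrm{B}}(M_1) \neq \mathrm{Ext}_{\mathrm{B}}(M_2)$ follows by the definition of $\mathrm{Ext}_{\mathrm{B}}$.

(iii) Consider any $X \in \mathrm{SM}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)))$. We need to show that there is
$M \in \mathrm{CM}(\Pi)$ such that $\mathrm{Ext}_{\mathrm{B}}(M) = X$. Let us establish first that $M \models \Pi$ holds for
$M = X \cap \mathrm{Hb}(\Pi)$. Since $X \in \mathrm{SM}(\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)))$ and $\mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))$
contains the rules $a \leftarrow {\sim}\overline{a}$ and $\overline{a} \leftarrow {\sim}a$ for each $a \in \mathrm{Hb}(\Pi)$, it holds for every
$a \in \mathrm{Hb}(\Pi)$ that $a \notin X \iff \overline{a} \in X$. Moreover, since $X \models \mathrm{b}_U(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi))$,
we obtain $X \not\models {\sim}A \cup {\sim}\overline{B}$ for all rules $A \leftarrow B \in \Pi$. Thus for each rule $A \leftarrow B$
in $\Pi$, there is $a \in A$ such that $a \in X$, or $b \in B$ such that $\overline{b} \in X$ ($\iff b \notin X$).
In either case, $M \models A \leftarrow B$ and therefore $M \models \Pi$, i.e. $M \in \mathrm{CM}(\Pi)$. It
remains to establish that $\mathrm{Ext}_{\mathrm{B}}(M) = X$. Since $M = X \cap \mathrm{Hb}(\Pi)$, we have
$\mathrm{Ext}_{\mathrm{B}}(M) = \mathrm{Ext}_{\mathrm{B}}(X \cap \mathrm{Hb}(\Pi)) = (X \cap \mathrm{Hb}(\Pi)) \cup \{\overline{a} \mid a \in \mathrm{Hb}(\Pi) \setminus X\}$.
Then $\mathrm{Ext}_{\mathrm{B}}(M) = X$ follows by the fact that $a \notin X \iff \overline{a} \in X$ holds for any
$a \in \mathrm{Hb}(\Pi)$. □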

Finally, we show the connection between $\mathrm{SM}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))) \neq \emptyset$ and the
unsatisfiability of $\mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, P, M)$. A similar unsatisfiability check is used in [5].

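Proposition 2. Let $\Pi$, $V$, and $P = \{a_1, \ldots, a_n\}$ be defined as in Definition 6 and
$\mathrm{Ext}_{\mathrm{B}}$ as in Proposition 1. Moreover, let $M \subseteq \mathrm{Hb}(\Pi)$ be a classical model of $\Pi$.
Then (i) if $N \in \mathrm{SM}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$, then $N = \mathrm{Hb}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$, and (ii)
$\mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, P, M)$ is unsatisfiable if and only if $\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))$ has a stable model.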

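Proof. (i) Assume that $N \in \mathrm{SM}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$. Since $N \models \mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))$
and the rule $\bot \leftarrow {\sim}u$ belongs to $\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))$, we must have $u \in N$. Furthermore,
since the rules $a^* \leftarrow u$ (for all $a \in \mathrm{Hb}(\Pi)$) and $d_i \leftarrow u$ (for all $1 \leq i \leq n$) belong to
$\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))$, it follows that $N = \mathrm{Hb}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$.

(ii) "$\Rightarrow$" Assume that $\mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, P, M)$ is unsatisfiable. It is easy to see that
$N \models \mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))$ holds for $N = \mathrm{Hb}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$. Let us then show that
$N \in \mathrm{MM}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))^N)$ by assuming the opposite, i.e. there is $N' \subset N$ such
that $N' \models \mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))^N$. Let us then assume $u \notin N'$ and define an interpretation
$M' = \{a \in \mathrm{Hb}(\Pi) \mid a^* \in N'\}$. The following observations can be made.

- We have $N' \models A^* \leftarrow B^*$ for each rule $A \leftarrow B \in \Pi$. Thus $M' \models \Pi$.

- Since $N' \models u \leftarrow d_i, a_i^*$ and $N' \models u \vee d_i \vee a_i^*$ for all $a_i \in P \cap \mathrm{Ext}_{\mathrm{B}}(M) = P \cap M$,
it holds $d_i \in N' \iff a_i^* \notin N'$ for all $a_i \in P \cap M$. Also $N' \models d_1 \vee \cdots \vee d_n$ and
$N' \models u \leftarrow d_i$ for all $a_i \in P \setminus \mathrm{Ext}_{\mathrm{B}}(M) = P \setminus M$. Thus there is $a_i \in P \cap M$ such
that $d_i \in N'$ and $a_i^* \notin N'$, too. This implies $a_i \notin M'$ and $M' \models \bot \leftarrow P \cap M$.

- Since $N' \models u \leftarrow a_i^*$ for all $a_i \in P \setminus M$, we have $a_i^* \notin N'$ for all $a_i \in P \setminus M$. This
implies $M' \models \{\bot \leftarrow a \mid a \in P \setminus M\}$.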







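Thus $M' \models \mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, P, M)$ which is a contradiction so that $u \in N'$ must be
the case. Since $u \in N'$ and the rules $a^* \leftarrow u$ (for all $a \in \mathrm{Hb}(\Pi)$) and $d_i \leftarrow u$ (for
all $1 \leq i \leq n$) belong to $\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))^N$, we must have that $a^* \in N'$ for all
$a \in \mathrm{Hb}(\Pi)$ and $d_i \in N'$ for $1 \leq i \leq n$. Thus $N' = N$ contradicting our previous
assumption. Therefore $N \in \mathrm{SM}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$ is necessarily the case.

(ii) "$\Leftarrow$" Consider any $N \in \mathrm{SM}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$. It follows by (i) that
$N = \mathrm{Hb}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$. Let us then assume that $\mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, P, M)$ is satisfiable, i.e.
there is $M' \subseteq \mathrm{Hb}(\Pi)$ such that $M' \models \Pi$, $M' \not\models P \cap M$ and $a \notin M'$ for all $a \in P \setminus M$.

It is established in the sequel that $N' \models \mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))^N$ holds for the interpretation
$N'$ defined as $(M')^* \cup \{d_i \mid a_i \in M \cap P \text{ and } a_i \notin M'\}$.

- Since $M' \models \Pi$, we have $N' \models A^* \cup \{u\} \leftarrow B^*$ for each rule $A \leftarrow B \in \Pi$.

- Since $M' \not\models P \cap M$, there is $d_i \in N'$ and thus $N' \models d_1 \vee \cdots \vee d_n \vee u$.

- The definition of $N'$ implies $a_i^* \notin N'$ and $d_i \notin N'$ for all $a_i \in P \setminus M$, as $a_i \notin M'$
for all $a_i \in P \setminus M$. Thus $N' \models u \leftarrow d_i$ and $N' \models u \leftarrow a_i^*$ when $a_i \in P \setminus M$.

- Given $a_i \in P \cap M$, we have $d_i \in N' \iff a_i \notin M'$, i.e. $a_i^* \notin N'$ by the definition
of $N'$. Thus $N' \models u \leftarrow d_i, a_i^*$ and $N' \models u \vee d_i \vee a_i^*$ hold whenever $a_i \in P \cap M$.

- Since $u \notin N'$, we have $N' \models a^* \leftarrow u$ for each $a \in \mathrm{Hb}(\Pi)$.

- Since $u \notin N'$, it follows that $N' \models d_i \leftarrow u$ for each $1 \leq i \leq n$.

Now $N' \subset N$ and $N' \models \mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M))^N$, contradicting the assumption
$N \in \mathrm{SM}(\mathrm{E}_U(\Pi, \mathrm{Ext}_{\mathrm{B}}(M)))$. Thus $\mathrm{Tr}_{\mathrm{UNSAT}}(\Pi, P, M)$ must be unsatisfiable. □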

We let $\mathcal{D}$ denote the class of DLPs under the stable model semantics [8,20]. The
translation function $\mathrm{Tr}_{\mathrm{BLIND}} : \mathcal{D}^+_{\mathrm{mv}} \rightarrow \mathcal{D}$ is clearly linear. Assuming that the visible
Herbrand base $\mathrm{Hb}_{\mathrm{v}}(\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)) = \mathrm{Hb}_{\mathrm{v}}(\Pi)$ by definition, the faithfulness of the
translation $\mathrm{Tr}_{\mathrm{BLIND}}(\Pi)$ follows by Theorem 4 from Lemma 1, and Propositions 1 and 2.

Theorem 5. $\mathcal{D}^+_{\mathrm{mv}} \leq_{\mathrm{PF}} \mathcal{D}$.



5 Discussion 

The main result of this paper is a linear translation from parallel circumscription into
disjunctive logic programs such that a bijective correspondence between the $(V, F)$-minimal
models of a PDLP $\Pi$ and the stable models of the respective translation
$\mathrm{Tr}_{\mathrm{BLIND}}(\mathrm{Tr}_{\mathrm{KK}}(\Pi))$ is obtained. As suggested by the analysis performed in Section
3, the translation function $\mathrm{Tr}_{\mathrm{BLIND}}$ is non-modular, reflecting the global nature of
varying atoms. In contrast to earlier attempts [7,22,14], our translation does not depend
on syntactic restrictions and it has a linear time/space complexity. Cadoli et al. [2] achieve
the same complexity, but their transformation has ordinary circumscription as the target
formalism, and hence a bijective relationship of models cannot be obtained. However,
the translation function $\mathrm{Tr}_{\mathrm{BLIND}}$ presented in this paper exploits default negation in
order to establish faithfulness in the strict sense implied by Definitions 5 and 2.

Our results enable the systematic use of varying atoms in order to develop more compact
formulations of problems as disjunctive logic programs. A good example in this
respect is the consistency-based diagnosis of digital circuits [21]. Reiter-style minimal
diagnoses are hard to formalize when all atoms are subject to minimization. Following
the ideas from [1], a digital circuit can be modeled as follows. For instance, an inverter
$I$ is described by a propositional theory $(o_I \leftrightarrow \neg i_I) \vee ab_I$, where the atoms $i_I$ and
$o_I$ model the input and the output of $I$, respectively, and $ab_I$ expresses the fact that
$I$ is operating against its specification. This theory can be equivalently formulated as
a PDLP $\Pi_I = \{ab_I \leftarrow i_I, o_I.\ \ i_I \vee o_I \vee ab_I.\}$ and minimal diagnoses correspond to
$(\{i_I, o_I\}, \emptyset)$-minimal models of $\Pi_I$ augmented by observations. This line of thinking
carries over to larger circuits which also have gates other than inverters as their components.
Assuming the availability of varying atoms, the description of the circuit can be
formed in a very modular fashion, component by component. Then the description can
be translated into a valid input for disjunctive solvers like dlv [15] and GnT [13] using
the translation function $\mathrm{Tr}_{\mathrm{BLIND}}$. On the other hand, we run into severe problems if all
atoms are set subject to minimization. For example, the program $\Pi_I$ which models an
inverter $I$ has three $(\emptyset, \emptyset)$-minimal models $M_1 = \{i_I\}$, $M_2 = \{o_I\}$, and $M_3 = \{ab_I\}$.
The first two minimal models capture natural explanations given no observations on $I$,
but the third minimal model does not correspond to a Reiter-style minimal diagnosis, as
$I$ is faulty according to it. Similar spurious minimal models are also obtained for more
complex circuits encoded in this way if all atoms are subject to minimization.
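
The inverter example is small enough to enumerate by hand or by machine. The sketch below is a brute-force illustration, assuming the usual definition of $(V, \emptyset)$-minimality; the atom names `i`, `o`, `ab` stand for $i_I$, $o_I$, $ab_I$, and the observation that both input and output are high is an invented test case.

```python
from itertools import combinations

def models(rules, atoms):
    """Classical models of positive rules given as (head, body) atom lists."""
    atoms = sorted(atoms)
    cands = (frozenset(c) for r in range(len(atoms) + 1)
             for c in combinations(atoms, r))
    return [m for m in cands
            if all(not set(b) <= m or set(h) & m for h, b in rules)]

def minimal(rules, atoms, varying):
    """Models that are minimal on the atoms outside the varying set."""
    p = set(atoms) - set(varying)
    ms = models(rules, atoms)
    return {m for m in ms if not any((n & p) < (m & p) for n in ms)}

inverter = [(["ab"], ["i", "o"]),      # ab <- i, o.
            (["i", "o", "ab"], [])]    # i v o v ab.
hb = {"i", "o", "ab"}

all_minimized = minimal(inverter, hb, set())          # every atom minimized
with_varying = minimal(inverter, hb, {"i", "o"})      # only ab minimized
# Invented observation: input and output are both high.
observed = minimal(inverter + [(["i"], []), (["o"], [])], hb, {"i", "o"})
```

With `i` and `o` varying, only the fault atom is minimized, so the spurious model in which the inverter is faulty without any observation disappears, while the contradictory observation forces the fault.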

Our first experiments with large combinational circuits showed that our approach is
not yet competitive with a special purpose engine [1] which exploits the 1-fault assumption.
The diagnosis front-end of the dlv system also covers Reiter-style minimal diagnoses [4],
but models like the one described above are ruled out by syntactic restrictions. Moreover,
contrary to $\mathrm{Tr}_{\mathrm{BLIND}}$, the translation used in the front-end yields only a many-to-one
correspondence between stable models and diagnoses.

As a further application of varying atoms, a specific reduction from quantified
Boolean formulas (QBFs) to DLPs [13] can be improved to produce all satisfying assignments
for a 2,$\exists$-QBF $\exists X \forall Y \phi$ given as input. Due to blind minimization, the current
reduction does not yield a one-to-one correspondence between the satisfying assignments
of $\exists X \forall Y \phi$ and the stable models of the resulting DLP. However, the validity of
$\exists X \forall Y \phi$ is properly captured by the reduction.

To conclude, it might be a good idea to implement varying atoms directly in disjunc- 
tive solvers. This is a challenge, as existing algorithms [13,15] rely much on the fact that 
all atoms are subject to minimization. A further question is how varying and fixed atoms 
should be incorporated into stable models. Is it enough to consider $(V, F)$-minimal models
of the Gelfond-Lifschitz reduct [8] or should $V$ and $F$ be dynamically determined?
Finally, we remind the reader about a reduction from prioritized circumscription to par- 
allel circumscription [16] which implies that even prioritized circumscription can be 
captured with disjunctive programs using the technique from Section 4. 



References 

1. P. Baumgartner, P. Fröhlich, U. Furbach, and W. Nejdl. Semantically guided theorem proving
for diagnosis applications. In Proceedings of the 15th International Joint Conference on
Artificial Intelligence, pages 460-465, Nagoya, 1997. Morgan Kaufmann.

2. M. Cadoli, T. Eiter, and G. Gottlob. An efficient method for eliminating varying predicates
from a circumscription. Artificial Intelligence, 54(2):397-410, 1992.

3. J. de Kleer and K. Konolige. Eliminating the fixed predicates from a circumscription. Artificial
Intelligence, 39(3):391-398, July 1989.

4. T. Eiter, W. Faber, N. Leone, and G. Pfeifer. The diagnosis frontend of the DLV system. AI
Communications, 12(1-2):99-111, 1999.

5. T. Eiter and G. Gottlob. On the computational cost of disjunctive logic programming:
Propositional case. Annals of Mathematics and Artificial Intelligence, 15:289-323, 1995.

6. T. Eiter, G. Gottlob, and H. Veith. Modular logic programming and generalized quantifiers.
In J. Dix, U. Furbach, and A. Nerode, editors, Logic Programming and Nonmonotonic
Reasoning, pages 289-308, Dagstuhl Castle, Germany, July 1997. Springer-Verlag. LNAI 1265.

7. M. Gelfond and V. Lifschitz. Compiling circumscriptive theories into logic programs. In
Proceedings of the 7th National Conference on Artificial Intelligence, pages 455-459, St.
Paul, MN, August 1988. AAAI Press.

8. M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive databases.
New Generation Computing, 9:365-385, 1991.

9. K. Inoue and C. Sakama. Negation as failure in the head. Journal of Logic Programming,
35(1):39-78, 1998.

10. T. Janhunen. On the intertranslatability of non-monotonic logics. Annals of Mathematics and
Artificial Intelligence, 27(1-4):79-128, 1999.

11. T. Janhunen. On the effect of default negation on the expressiveness of disjunctive rules.
In T. Eiter, W. Faber, and M. Truszczyński, editors, Logic Programming and Nonmonotonic
Reasoning, Proceedings of the 6th International Conference, pages 93-106, Vienna, Austria,
September 2001. Springer-Verlag. LNAI 2173.

12. T. Janhunen. Translatability and intranslatability results for certain classes of logic programs.
Series A: Research report 82, Helsinki University of Technology, Laboratory for Theoretical
Computer Science. Espoo, Finland, November 2003.

13. T. Janhunen, I. Niemelä, D. Seipel, P. Simons, and J.-H. You. Unfolding partiality and
disjunctions in stable model semantics. ACM Transactions on Computational Logic, 2004.
Accepted for publication, see http://www.acm.org/tocl/accepted.html.

14. J. Lee and F. Lin. Loop formulas for circumscription. In Proceedings of the 19th National
Conference on Artificial Intelligence, pages 281-286, San Jose, California, July 2004. AAAI.

15. N. Leone, G. Pfeifer, W. Faber, T. Eiter, G. Gottlob, and F. Scarcello. The DLV system for
knowledge representation and reasoning. CoRR: cs.AI/0211004 v2, August 2003.

16. V. Lifschitz. Computing circumscription. In Proceedings of the 9th International Joint
Conference on Artificial Intelligence, pages 121-127, Los Angeles, California, USA, August
1985. Morgan Kaufmann.

17. V. Lifschitz and H. Turner. Splitting a logic program. In Proceedings of the 11th International
Conference on Logic Programming, pages 23-37. MIT Press, 1994.

18. J. McCarthy. Circumscription - a form of non-monotonic reasoning. Artificial Intelligence,
13:27-39, 1980.

19. I. Niemelä. A tableau calculus for minimal model reasoning. In Analytic Tableaux and
Related Methods: Fifth Workshop on Theorem Proving with Analytic Tableaux and Related
Methods, pages 278-294, 1996. LNCS 1071.

20. T.C. Przymusinski. Stable semantics for disjunctive programs. New Generation Computing,
9:401-424, 1991.

21. R. Reiter. A theory of diagnosis from first principles. Artificial Intelligence, 32:57-95, 1987.

22. C. Sakama and K. Inoue. Embedding circumscriptive theories in general disjunctive programs.
In Proceedings of the 3rd International Conference on Logic Programming and Nonmonotonic
Reasoning, pages 344-357. Springer-Verlag, 1995.

23. P. Simons. Extending the stable model semantics with more expressive rules. In Proceedings
of the 5th International Conference on Logic Programming and Nonmonotonic Reasoning,
pages 305-316, El Paso, Texas, USA, 1999. Springer-Verlag.




Towards a First Order Equilibrium Logic for 
Nonmonotonic Reasoning 



David Pearce^1* and Agustín Valverde^2**



1 Universidad Rey Juan Carlos (Madrid, Spain)
d.pearce@escet.urjc.es
2 Universidad de Málaga (Málaga, Spain)
a_valverde@ctima.uma.es


Abstract. Equilibrium logic, introduced in [20], is a conservative extension
of answer set semantics for logic programs to the full language
of propositional logic. In this paper we initiate the study of first-order
variants of equilibrium logic. In particular, we focus on a quantified version
$QN_5$ of the propositional many-valued logic $N_5$ of here-and-there
with strong negation, and define the condition of equilibrium via a minimal
model construction. We verify Skolem forms and Herbrand theorems
for $QN_5$ and show that, like its propositional counterpart, the quantified
version of equilibrium logic also conservatively extends answer set
semantics.



1 Introduction 

Equilibrium logic, introduced in [20], is a general purpose propositional formalism
for nonmonotonic reasoning with two kinds of negation: strong negation,
representing explicit falsity, and weak or intuitionistic negation which allows for the
expression of default relationships. One of the main features of equilibrium logic
is that, under all the usual classes of logic programs, it is equivalent to reasoning
under answer set semantics and therefore amounts to a conservative extension
of answer set inference to the full propositional language. With the emergence
of answer set solvers such as dlv [15], GnT [13], and smodels [29], answer set
programming (ASP) now provides a practical and viable environment for knowledge
representation and declarative problem solving. AI applications include planning
and diagnosis [2], the management of heterogeneous data in information systems,^1
the representation of ontologies in the semantic web [4], as well as compact and
fully declarative representations of hard combinatorial problems.^2

Compared to ASP systems, equilibrium logic is much less well-developed as 
a practical knowledge representation tool. Nevertheless it can be implemented 

* Partially supported by CICyT project TIC-2003-9001-C02, URJC project PPR- 
2003-39 and WASP (IST-2001-37004) 

** Partially supported by CICyT project TIC-2003-9001-C01 and Junta de Andalucía
project TIC-115.

1 see the INFOMIX project http://sv.mat.unical.it/infomix/ 

2 For examples and a thorough introduction to ASP, see [3] 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 147-160, 2004.
(c) Springer-Verlag Berlin Heidelberg 2004







in different ways. For example a reduction to quantified boolean formulas has
been presented, allowing for an implementation in QBF-based systems such
as QUIP [26]. In [23,24] proof systems for equilibrium logic are given which
form the basis for a prototype implementation currently being developed at
the University of Málaga. The paper [25] presents a polynomial translation of
a restricted class of theories in equilibrium logic, called nested programs, into
disjunctive logic programs and describes an implementation extending dlv: here
equilibrium logic is equivalent to the answer set semantics for nested programs
described in [17] (though it predates the latter). Aside from its potential as
a knowledge representation formalism, equilibrium logic has already proved
useful in the study of the logical and mathematical foundations of ASP. For
example, it provided the basis for characterising the strong equivalence of logic
programs in [16] and the uniform equivalence of programs in [27]. Recently
it has been used to characterise synonymous theories [28] and to develop and
study transformations that preserve the strong equivalence of logic programs
and allow for program simplification in the setting of ASP [22].

Our aim in this paper is to initiate the study of first-order versions of equilibrium
logic. It is far from obvious that there should be a unique, natural,
quantified version of equilibrium logic. In searching for suitable candidates we
shall be guided by two main methodological considerations. The first is coherence
with respect to the logical features present in the propositional case. That
is, bearing in mind the underlying monotonic base logic and the minimal model
construction that defines equilibrium, we shall be looking for suitable first-order
extensions of the base logic and ways to preserve the central idea behind the
construction of intended models. The second consideration concerns answer set
programming. Currently, answer set solvers implement a grounding procedure
to eliminate free variables prior to model generation and testing. In the
propositional case, therefore, equilibrium logic agrees with or 'captures' answer set
semantics for ground programs. In the first-order case, we would like to maintain
this relation for logic programs and ultimately obtain logical methods for
analysing and simplifying programs prior to grounding.

In the propositional case equilibrium logic is based on the nonclassical logic
$N_5$ of here-and-there with strong negation. To our knowledge, first-order
versions of $N_5$, or even of the logic of here-and-there, have not previously been
studied in the literature. Thus, a good deal of our work in this paper is devoted
to considering different first-order versions of $N_5$ and selecting a candidate to
form the basis for first-order equilibrium logic. We present this logic in §2 and
call it $QN_5$. It can be equivalently represented as a 5-valued logic or as the logic
of rooted linear Kripke frames with two elements ('here' and 'there') having
constant domains. It permits a straightforward definition of the equilibrium model
construction and appears to be adequate as a tool for applications in ASP. Since
$QN_5$ is something of an unknown on the logical landscape, we devote space in
§3 to examining some of its basic properties. We look at prenex normal forms,
Skolem forms and establish Herbrand theorems in various guises. The remainder
of the paper is then organised as follows. In §4 we turn to the proof theory of
$QN_5$, sketching a sound and complete tableaux calculus. In §5 we provide the
minimal model construction that defines equilibrium logic in the first-order case.
Then we show that it satisfies the main criteria of adequacy that we mentioned
informally above. More precisely: (i) on universal theories the selected models
coincide with the propositional equilibrium models of the theories' ground versions;
(ii) for logic programs the Herbrand equilibrium models coincide with answer
sets. Some consequences of this are briefly discussed and in §6 we conclude by
considering related work and some of the main issues to be tackled in the future.

2 First-Order $N_5$ Logics

For the propositional version of equilibrium logic the reader is referred to [20, 
21,24,16]. It is based on the the propositional logic N 5 of lrere-and-there with 
strong negation that contains the logical constants: A, V, — A -i, ~, standing 
respectively for conjunction, disjunction, implication, weak (or intuitionistic) 
negation and strong negation. Presented as a Hilbert-style axiomatic system, 
the axioms and rules of inference for N5 are those of intuitionistic logic (see eg 
[31]) together with: 

1. the axiom schema of Łukasiewicz [18]

   $$(\neg\alpha \to \beta) \to (((\beta \to \alpha) \to \beta) \to \beta) \qquad (1)$$

   which characterises the 3-valued here-and-there logic of Heyting [12] and
   Gödel [8] (hence it is sometimes known as Gödel’s 3-valued logic).

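2. the following axiom schemata involving strong negation taken from the
   calculus of Vorob’ev [32,33] (where ‘$\alpha \leftrightarrow \beta$’ abbreviates $(\alpha \to \beta) \wedge (\beta \to \alpha)$):

   N1. $\sim(\alpha \to \beta) \leftrightarrow \alpha \wedge \sim\beta$
   N2. $\sim(\alpha \wedge \beta) \leftrightarrow \sim\alpha \vee \sim\beta$
   N3. $\sim(\alpha \vee \beta) \leftrightarrow \sim\alpha \wedge \sim\beta$
   N4. $\sim\sim\alpha \leftrightarrow \alpha$
   N5. $\sim\neg\alpha \leftrightarrow \alpha$
   N6. (for atomic $\alpha$) $\sim\alpha \to \neg\alpha$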



As one can see, there are three basic components to this picture: intuitionistic
logic, the Łukasiewicz axiom (1) and the Vorob’ev axioms for strong negation.
The last of these components can be regarded as essentially a defining
characteristic of strong negation and should be preserved in any quantified version.

One straightforward way to obtain quantified N5 is therefore to take
Nelson’s constructive predicate logic with strong negation [19,9] and simply add
the Łukasiewicz axiom. This would amount to the following system: (i) axioms
and rules of first-order intuitionistic logic; (ii) the axiom schema (1); (iii) the
Vorob’ev axioms augmented with the following two schemata:

$$\sim\exists x\,\alpha \leftrightarrow \forall x\,\sim\alpha \qquad\qquad \sim\forall x\,\alpha \leftrightarrow \exists x\,\sim\alpha \qquad (2)$$

A second approach to obtaining a quantified version of N5 may be called the
semantical approach. Nelson’s constructive logic has a natural and appealing
characterisation in terms of Kripke models. This is the one that gives rise to the
term here-and-there. The idea is to take the usual Kripke model semantics for
intuitionistic logic but to allow for sentences to be not only constructively
verified at possible worlds or stages of the model, but also constructively falsified
(equivalently their strong negations are verified), see [9]; in addition one
restricts attention to 2-element, here-and-there frames. This leads to the following
semantics that we denote by FOHT.³

We consider a first-order language built over a set of constants, $C$, a set of
functions, $\mathcal{F}$, and a set of predicates, $\mathcal{P}$. We denote by $\mathrm{Term}(C, \mathcal{F})$ the set of
ground terms defined from $C$ and $\mathcal{F}$; we denote by $\mathrm{Atom}(C, \mathcal{F}, \mathcal{P})$ the set of
ground atomic formulas defined from $C$, $\mathcal{F}$ and $\mathcal{P}$ in the usual way; we denote
by $\mathrm{Lit}(C, \mathcal{F}, \mathcal{P})$ the set of ground literals, that is, either ground atomic formulas
or the strong negation of ground atomic formulas. If $L$ is an atom, we say that
$\sim L$ is the contrary of $L$ and vice versa. We will use the usual notions of free and
bound variable, but we essentially only work with closed formulas or sentences.

An FOHT-model is a quadruple $\mathcal{M} = \langle D_h, H, D_t, T \rangle$ such that: $D_h$ and $D_t$
are non-empty sets such that $C \subseteq D_h \subseteq D_t$; $H$ and $T$ are sets of literals in
$\mathrm{Lit}(D_t, \mathcal{F}, \mathcal{P})$ such that $H \subseteq T$, $T$ does not contain contrary literals and $H$ does
not contain constants from $D_t \setminus D_h$. We shall define the satisfaction relation
$\models$ for “worlds” $w \in \{h, t\}$ where $h \le h$, $t \le t$ and $h \le t$ (we use the following
notation: $T_h = \mathrm{Term}(D_h, \mathcal{F})$ and $T_t = \mathrm{Term}(D_t, \mathcal{F})$):

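- For every literal $L$: $\mathcal{M}, h \models L$ iff $L \in H$, and $\mathcal{M}, t \models L$ iff $L \in T$.
- $\mathcal{M}, w \models \varphi \wedge \psi$ iff $\mathcal{M}, w \models \varphi$ and $\mathcal{M}, w \models \psi$.
- $\mathcal{M}, w \models \varphi \vee \psi$ iff $\mathcal{M}, w \models \varphi$ or $\mathcal{M}, w \models \psi$.
- $\mathcal{M}, w \models \varphi \to \psi$ iff for every $w' \ge w$, if $\mathcal{M}, w' \models \varphi$ then $\mathcal{M}, w' \models \psi$.
- $\mathcal{M}, w \models \neg\varphi$ iff for no $w' \ge w$, $\mathcal{M}, w' \models \varphi$.
- $\mathcal{M}, w \models \forall x A(x)$ iff for every $w' \ge w$ and every $d \in T_{w'}$, $\mathcal{M}, w' \models A(d)$.
- $\mathcal{M}, w \models \exists x A(x)$ iff $\mathcal{M}, w \models A(d)$ for some $d \in T_w$.
- $\mathcal{M}, w \models \sim(\varphi \wedge \psi)$ iff $\mathcal{M}, w \models \sim\varphi$ or $\mathcal{M}, w \models \sim\psi$.
- $\mathcal{M}, w \models \sim(\varphi \vee \psi)$ iff $\mathcal{M}, w \models \sim\varphi$ and $\mathcal{M}, w \models \sim\psi$.
- $\mathcal{M}, w \models \sim(\varphi \to \psi)$ iff $\mathcal{M}, w \models \varphi$ and $\mathcal{M}, w \models \sim\psi$.
- $\mathcal{M}, w \models \sim\neg\varphi$ iff $\mathcal{M}, w \models \varphi$.
- $\mathcal{M}, w \models \sim\sim\varphi$ iff $\mathcal{M}, w \models \varphi$.
- $\mathcal{M}, w \models \sim\forall x A(x)$ iff $\mathcal{M}, w \models \sim A(d)$ for some $d \in T_w$.
- $\mathcal{M}, w \models \sim\exists x A(x)$ iff for every $w' \ge w$ and every $d \in T_{w'}$, $\mathcal{M}, w' \models \sim A(d)$.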

Truth of a formula in a model is defined as follows: for every formula $A$,
$\mathcal{M} \models A$ iff $\mathcal{M}, h \models A$ and $\mathcal{M}, t \models A$. A formula is valid in FOHT if it is true
in all models.

If we add the assumption of constant domains, namely that in each model
$D_h = D_t$, a stronger version of FOHT is obtained; for example, in the general
case the formula $\forall x(A(x) \vee B) \to (\forall x A(x) \vee B)$ ($B$ is closed) is not valid, as the
following counter-model shows: $\langle \{a\}, \{P(a)\}, \{a, b\}, \{P(a), B\} \rangle$. However in all
models such that $D_h = D_t$ the formula $\forall x(A(x) \vee B) \to (\forall x A(x) \vee B)$ holds. We
denote by FOHT$_c$ the logic determined by constant domain models of the form
$\langle D, H, D, T \rangle$; we denote them simply by $\langle D, H, T \rangle$ and $D$ is called the domain
of the model.

³ It is still an open question whether this second approach is equivalent to the first.
We hope to clarify the matter in a future study.
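The counter-model above can be checked mechanically. The following is a minimal sketch, not part of the original development: it assumes a language without function symbols (so ground terms are just domain elements), reads $A(x)$ as the atom $P(x)$, and encodes formulas as nested tuples. The names `holds` and `true_in` are hypothetical helpers introduced only for this illustration.

```python
# Minimal evaluator for the fragment {atoms, or, imp, forall} of the FOHT
# semantics. A model is a pair (domains, literals), each indexed by world.
def holds(model, w, f, env=None):
    env = env or {}
    dom, lits = model
    worlds_above = ("h", "t") if w == "h" else ("t",)   # h <= h, h <= t, t <= t
    op = f[0]
    if op == "P":                       # atom P(x), x looked up in env
        return ("P", env[f[1]]) in lits[w]
    if op == "B":                       # closed atom B
        return ("B",) in lits[w]
    if op == "or":
        return holds(model, w, f[1], env) or holds(model, w, f[2], env)
    if op == "imp":                     # checked at every world above w
        return all(not holds(model, v, f[1], env) or holds(model, v, f[2], env)
                   for v in worlds_above)
    if op == "forall":                  # every world above w, every element there
        return all(holds(model, v, f[2], {**env, f[1]: d})
                   for v in worlds_above for d in dom[v])
    raise ValueError(f"unknown operator: {op}")

def true_in(model, f):
    return holds(model, "h", f) and holds(model, "t", f)

Px, B = ("P", "x"), ("B",)
formula = ("imp", ("forall", "x", ("or", Px, B)),
                  ("or", ("forall", "x", Px), B))

literals = {"h": {("P", "a")}, "t": {("P", "a"), ("B",)}}
varying = ({"h": {"a"}, "t": {"a", "b"}}, literals)        # D_h = {a}, D_t = {a, b}
constant = ({"h": {"a", "b"}, "t": {"a", "b"}}, literals)  # D_h = D_t = {a, b}

print(true_in(varying, formula))   # False: the counter-model refutes the formula
print(true_in(constant, formula))  # True: with a constant domain it holds here
```

The second model only illustrates one constant-domain instance; it is not a proof that the formula is valid in FOHT$_c$.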

2.1 Five-Valued Semantics

A third approach to obtaining a first-order version of N5 is also semantical.
In the propositional case the Kripke semantics is easily characterised using a
five-valued matrix: the set of truth values is $\mathbf{5} = \{-2, -1, 0, 1, 2\}$ and 2 is the
designated value; the connectives are interpreted as follows: $\wedge$ is the minimum
function, $\vee$ is the maximum function, $\sim x = -x$,

$$x \to y = \begin{cases} 2 & \text{if either } x \le 0 \text{ or } x \le y \\ y & \text{otherwise} \end{cases} \qquad\text{and}\qquad \neg x = \begin{cases} 2 & \text{if } x \le 0 \\ -x & \text{otherwise} \end{cases}$$

If we add quantifiers using the standard approach for many-valued logics, we
obtain a semantics which we can denote by QN5 as follows: a model is a pair $(D, \sigma)$
where $D \supseteq C$ is non-empty and called the domain and $\sigma : \mathrm{Atom}(D, \mathcal{F}, \mathcal{P}) \to \mathbf{5}$ is
the assignment. If $T = \mathrm{Term}(D, \mathcal{F})$ the model is extended to closed quantified
formulas in the following way:

$$\sigma(\forall x A(x)) = \min\{\sigma(A(t));\ t \in T\} \qquad\qquad \sigma(\exists x A(x)) = \max\{\sigma(A(t));\ t \in T\}$$
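The matrix and the quantifier clauses translate directly into code. The sketch below is only an illustration under the assumption of a finite set of ground instances; the function names and the example assignment `sigma` are hypothetical and not taken from the paper.

```python
# Truth values of the five-valued matrix; 2 is the designated value.
VALUES = (-2, -1, 0, 1, 2)

def conj(x, y):        # conjunction is the minimum
    return min(x, y)

def disj(x, y):        # disjunction is the maximum
    return max(x, y)

def strong_neg(x):     # strong negation flips the sign
    return -x

def imp(x, y):         # implication of the matrix
    return 2 if (x <= 0 or x <= y) else y

def weak_neg(x):       # weak (intuitionistic) negation
    return 2 if x <= 0 else -x

def forall(values):    # universal quantifier: minimum over all instances
    return min(values)

def exists(values):    # existential quantifier: maximum over all instances
    return max(values)

# Hypothetical finite example: a unary predicate P over three ground terms.
sigma = {"P(a)": 2, "P(b)": 1, "P(c)": 2}
instances = list(sigma.values())
print(forall(instances))            # 1, the value of  forall x P(x)
print(exists(instances))            # 2, the value of  exists x P(x)
print(weak_neg(forall(instances)))  # -1, the value of  not forall x P(x)
```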

In the propositional case an N5-model $\sigma$ as a truth-value assignment can trivially
be converted into a Kripke model $(H, T)$, and vice versa with the conversion rules
shown in the following table:



Table 1.

$\sigma(p) = 2$ iff $p \in H$
$\sigma(p) = 1$ iff $p \in T$, $p \notin H$
$\sigma(p) = 0$ iff $p \notin T$, $\sim p \notin T$
$\sigma(p) = -1$ iff $\sim p \in T$, $\sim p \notin H$
$\sigma(p) = -2$ iff $\sim p \in H$
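The conversion rules of Table 1 can be sketched as a pair of mutually inverse functions. This is an illustrative encoding only: literals are assumed to be strings, the strong negation of an atom `p` is written `"~" + p`, and the names `to_value` and `to_kripke` are hypothetical.

```python
def to_value(p, H, T):
    """Five-valued value of atom p in the here-and-there model (H, T)."""
    neg = "~" + p
    if p in H:
        return 2
    if p in T:          # p in T but not in H
        return 1
    if neg in H:
        return -2
    if neg in T:        # ~p in T but not in H
        return -1
    return 0            # neither p nor ~p is in T

def to_kripke(sigma):
    """Here-and-there model (H, T) of a five-valued assignment on atoms."""
    H, T = set(), set()
    for p, v in sigma.items():
        lit = p if v > 0 else "~" + p
        if abs(v) >= 1:
            T.add(lit)
        if abs(v) == 2:
            H.add(lit)
    return H, T

sigma = {"p": 2, "q": 1, "r": 0, "s": -1, "u": -2}
H, T = to_kripke(sigma)
print(sorted(H))  # ['p', '~u']
print(sorted(T))  # ['p', 'q', '~s', '~u']
```

By construction $H \subseteq T$ and $T$ contains no contrary literals, as required of a Kripke model.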

In the first-order case, however, the Kripke and the many-valued semantics are
not equivalent. Since in any many-valued logic the quantifiers are interpreted as
supremum and infimum, it follows that the formulas $\forall x(A(x) \vee B)$ and $\forall x A(x) \vee B$
are equivalent, which we have seen is not the case for the Kripke semantics.
However, there is full equivalence with respect to the logic of constant domains.

Theorem 1. There is a bijection $f$ between FOHT$_c$-models and QN5-models
such that for any formula $A$ and FOHT$_c$-model $\mathcal{M}$, $\mathcal{M} \models A$ iff $f(\mathcal{M})(A) = 2$.
Thus in particular a formula $A$ is valid in FOHT$_c$ if and only if it is valid in
QN5.

Proof. The bijection is established by the conversion rules in Table 1 applied to
ground atoms; by induction it is easy to prove that these rules are also valid for
any ground formula, and that allows us to conclude the result. □







In other words we can equivalently work with Kripke models having constant 
domains or with the five-valued semantics. In what follows we alternate freely 
between these two representations, depending on which version is simpler or 
more intuitive for the task at hand. 



3 Some Metatheory for QN5

For the remainder of this paper we are going to explore QN5 or the equivalent
constant domain logic as our basis for defining a quantified version of
equilibrium logic. There are several reasons for this choice. First, as we shall see, it is
easy to check that QN5 possesses several properties that are desirable from the
perspective of automated deduction. Indeed, as it is obtained from the many-valued
propositional logic by standard means, we can in some cases make use of
well-known techniques and methods from many-valued logic to prove properties
of QN5 and to describe a complete proof theory. Secondly, as we shall show in
Theorem 8, the many-valued approach is adequate as a first step towards an
extension of answer set semantics, and any other generalisations should agree
with it. Further extensions should analyse the intended meaning of quantifiers
in general first-order logic programs, a topic we postpone for future work.



3.1 Prenex and Skolem Forms 



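Theorem 2. In QN5 the following equivalences hold for any closed formula $C$:

$$\begin{array}{ll}
\forall x A(x) \wedge C \equiv \forall x (A(x) \wedge C) & \exists x A(x) \wedge C \equiv \exists x (A(x) \wedge C) \\
\forall x A(x) \vee C \equiv \forall x (A(x) \vee C) & \exists x A(x) \vee C \equiv \exists x (A(x) \vee C) \\
\exists x A(x) \to C \equiv \forall x (A(x) \to C) & \forall x A(x) \to C \equiv \exists x (A(x) \to C) \\
C \to \forall x A(x) \equiv \forall x (C \to A(x)) & C \to \exists x A(x) \equiv \exists x (C \to A(x)) \\
\neg\exists x A(x) \equiv \forall x \neg A(x) & \neg\forall x A(x) \equiv \exists x \neg A(x)
\end{array}$$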



Proof. The properties of conjunction and disjunction are a consequence of the 
associative and distributive properties of the operators max and min over finite 
sets. Additionally, implication is decreasing in the first argument and increasing 
in the second one and weak negation is decreasing; thus, the monotony of max 
and min allows us to conclude the last equivalences. □ 
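The monotonicity argument can be sanity-checked by brute force on the matrix. The sketch below is an illustration, not a proof: it assumes finitely many ground instances of $A(x)$ (tuples of at most `max_len` values) and tests each equivalence of Theorem 2 as an equality of truth values. The function name `check_theorem2` is hypothetical.

```python
from itertools import product

VALUES = (-2, -1, 0, 1, 2)

def imp(x, y):
    return 2 if (x <= 0 or x <= y) else y

def weak_neg(x):
    return 2 if x <= 0 else -x

def check_theorem2(max_len=3):
    """Check the ten prenex equivalences for all value tuples a (the values
    of the instances A(t)) of length <= max_len and all values c of C."""
    for n in range(1, max_len + 1):
        for a in product(VALUES, repeat=n):
            # weak negation is decreasing, so it swaps max and min
            assert weak_neg(max(a)) == min(weak_neg(x) for x in a)
            assert weak_neg(min(a)) == max(weak_neg(x) for x in a)
            for c in VALUES:
                # conjunction and disjunction
                assert min(min(a), c) == min(min(x, c) for x in a)
                assert max(min(a), c) == min(max(x, c) for x in a)
                assert min(max(a), c) == max(min(x, c) for x in a)
                assert max(max(a), c) == max(max(x, c) for x in a)
                # implication: decreasing in the first argument ...
                assert imp(max(a), c) == min(imp(x, c) for x in a)
                assert imp(min(a), c) == max(imp(x, c) for x in a)
                # ... and increasing in the second
                assert imp(c, min(a)) == min(imp(c, x) for x in a)
                assert imp(c, max(a)) == max(imp(c, x) for x in a)
    return True

print(check_theorem2())  # True: no assertion fails
```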

As a consequence we have: 

Corollary 1. In QN5 every formula is equivalent to a formula in prenex normal
form, i.e. of the form $Q_1 x_1 \ldots Q_n x_n A(x_1, \ldots, x_n)$, where $Q_i \in \{\forall, \exists\}$ and
$A(x_1, \ldots, x_n)$ is a quantifier-free formula.

Next we turn to satisfiability-preserving Skolemization and the associated
Herbrand theorem. These results proceed by adding new constants and functions
to the language, and for this reason we specify the signature of the logic:
QN5$(C, \mathcal{F}, \mathcal{P})$ denotes the logic over the language with signature $(C, \mathcal{F}, \mathcal{P})$.

Lemma 1. 1. $\exists y A(y)$ is satisfiable in QN5$(C, \mathcal{F}, \mathcal{P})$ if and only if $A(a)$ is
satisfiable in QN5$(C \cup \{a\}, \mathcal{F}, \mathcal{P})$, where $a \notin C$.







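2. $\forall x_1 \ldots \forall x_n \exists y A(y, x_1, \ldots, x_n)$ is satisfiable in QN5$(C, \mathcal{F}, \mathcal{P})$ if and only if the
formula $\forall x_1 \ldots \forall x_n A(f(x_1, \ldots, x_n), x_1, \ldots, x_n)$ is satisfiable in
QN5$(C, \mathcal{F} \cup \{f\}, \mathcal{P})$, where $f \notin \mathcal{F}$.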

Proof. We demonstrate only item 2; the other one is similar. Let $(D, \sigma)$ be a
model of $\forall x_1 \ldots \forall x_n \exists y A(y, x_1, \ldots, x_n)$ in QN5$(C, \mathcal{F}, \mathcal{P})$ and $T = \mathrm{Term}(D, \mathcal{F})$;
then

$$2 = \sigma(\forall x_1 \ldots \forall x_n \exists y A(y, x_1, \ldots, x_n)) = \min_{t_1, \ldots, t_n \in T} \Big( \max_{t \in T} \sigma(A(t, t_1, \ldots, t_n)) \Big)$$

and thus $\max_{t \in T} \sigma(A(t, t_1, \ldots, t_n)) = 2$ for all $t_1, \ldots, t_n \in T$; so we can define the
operator $\Phi_f : \mathrm{Atom}(D, \mathcal{F} \cup \{f\}, \mathcal{P}) \to \mathrm{Atom}(D, \mathcal{F}, \mathcal{P})$ (and this can be extended to
general formulas) that works by replacing recursively every term $f(t_1, \ldots, t_n)$ by
a term $t$ such that $\sigma(A(t, t_1, \ldots, t_n)) = 2$. Let $\sigma'$ be an assignment in
QN5$(C, \mathcal{F} \cup \{f\}, \mathcal{P})$ defined by $\sigma'(L) = \sigma(\Phi_f(L))$; it is easy to prove by induction that
$\sigma'(B) = \sigma(\Phi_f(B))$ for every closed formula $B$ and so, if $T' = \mathrm{Term}(D, \mathcal{F} \cup \{f\})$:

$$\begin{aligned}
\sigma'(\forall x_1 \ldots \forall x_n A(f(x_1, \ldots, x_n), x_1, \ldots, x_n))
&= \min_{t_i \in T'} \sigma(\Phi_f(A(f(t_1, \ldots, t_n), t_1, \ldots, t_n))) \\
&\ge \min_{t_i \in T} \sigma(A(\Phi_f(f(t_1, \ldots, t_n)), t_1, \ldots, t_n)) = 2
\end{aligned}$$

Conversely, let $(D, \sigma)$ be a model of $\forall x_1 \ldots \forall x_n A(f(x_1, \ldots, x_n), x_1, \ldots, x_n)$:

$$\min_{t_i \in T'} \sigma(A(f(t_1, \ldots, t_n), t_1, \ldots, t_n)) = 2$$

and let $\Phi : D' \to T^n$ be a bijection, where $D' \cap D = \emptyset$, and its extension to the
set of atoms, $\Phi : \mathrm{Atom}(D \cup D', \mathcal{F}, \mathcal{P}) \to \mathrm{Atom}(D, \mathcal{F} \cup \{f\}, \mathcal{P})$, replacing every
$c \in D'$ by $f(\Phi(c))$. Then the model $(D \cup D', \tau)$ is defined by $\tau(L) = \sigma(\Phi(L))$. □

Definition 1. Let $A$ be a formula, $C_A$ the set of constants in $A$ and $\mathcal{F}_A$ the set
of functions in $A$. The Herbrand models of $A$ are the models in QN5$(C_A, \mathcal{F}_A, \mathcal{P})$
with domain $C_A$. $H_A = \mathrm{Term}(C_A, \mathcal{F}_A)$ is called the Herbrand universe of $A$.
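When $\mathcal{F}_A$ is non-empty the Herbrand universe is infinite, so in practice one can only enumerate it up to a nesting depth. A minimal sketch follows, assuming terms are represented as strings and function symbols are given with their arities; the name `herbrand_terms` and the depth bound are assumptions of this illustration.

```python
from itertools import product

def herbrand_terms(constants, functions, depth):
    """Ground terms over `constants` and `functions` (a dict name -> arity)
    whose nesting depth is at most `depth`."""
    terms = set(constants)
    for _ in range(depth):
        new_terms = set(terms)
        for name, arity in functions.items():
            # apply each function symbol to every tuple of known terms
            for args in product(sorted(terms), repeat=arity):
                new_terms.add(f"{name}({','.join(args)})")
        terms = new_terms
    return terms

print(sorted(herbrand_terms({"a"}, {"f": 1}, 2)))
# ['a', 'f(a)', 'f(f(a))']
```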



Lemma 2. If $A(x_1, \ldots, x_n)$ is quantifier-free, then $B = \forall x_1 \ldots \forall x_n A(x_1, \ldots, x_n)$
is satisfiable iff it has a Herbrand model.

Proof. If $(D, \sigma)$ is a model of $B$ then $H_B = \mathrm{Term}(C_B, \mathcal{F}_B) \subseteq \mathrm{Term}(D, \mathcal{F}) = T$.
Then the restriction $\tau$ of $\sigma$ to $H_B$ is a model of $B$:

$$\tau(B) = \min_{t_i \in H_B} \tau(A(t_1, \ldots, t_n)) \ge \min_{t_i \in T} \sigma(A(t_1, \ldots, t_n)) = 2$$

□



Theorem 3 (Herbrand’s theorem for satisfiability). Let $A$ be a formula
in QN5; then there is an algorithm for converting $A$ into a prenex formula $A^*$,
with only universal quantifiers, such that $A$ is satisfiable if and only if $A^*$ is
satisfiable by a Herbrand model (of $A^*$).







Proof. First, the equivalences of Theorem 2 are applied to obtain a prenex
formula, $A'$, equivalent to $A$; then the transformations in Lemma 1 are applied
to eliminate every existential quantifier in the prefix of $A'$ (from left to right)
introducing fresh constants and functions. The resulting formula is the formula
$A^*$ that we are looking for. Applying Lemma 2 concludes the proof. □
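The elimination step of this proof, processing the quantifier prefix from left to right, can be sketched as follows. The representation is an assumption of this illustration: a prefix is a list of `(quantifier, variable)` pairs, and `skolemize_prefix` (a hypothetical helper) returns the remaining universal variables together with the substitution of Skolem terms for the existential ones. Fresh symbols are named `sk0`, `sk1`, ... and are assumed not to occur in the signature.

```python
def skolemize_prefix(prefix, fresh="sk"):
    """Replace each existential variable by a fresh Skolem term whose
    arguments are the universal variables that precede it in the prefix."""
    universals, subst = [], {}
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        elif quantifier == "exists":
            name = f"{fresh}{len(subst)}"
            # no preceding universal variable: a fresh constant suffices
            subst[var] = f"{name}({','.join(universals)})" if universals else name
        else:
            raise ValueError(f"unknown quantifier: {quantifier}")
    return universals, subst

prefix = [("exists", "u"), ("forall", "x1"), ("forall", "x2"), ("exists", "y")]
print(skolemize_prefix(prefix))
# (['x1', 'x2'], {'u': 'sk0', 'y': 'sk1(x1,x2)'})
```

Applying the returned substitution to the quantifier-free matrix and prefixing the universal variables yields a formula with only universal quantifiers, in the spirit of $A^*$.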

Next we turn to validity-preserving Skolemization and its associated Her- 
brand theorem. We omit the details of the proofs because they are similar to the 
previous results. 

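Lemma 3. 1. $\forall y A(y)$ is valid in QN5$(C, \mathcal{F}, \mathcal{P})$ if and only if for some $a \notin C$,
$A(a)$ is valid in QN5$(C \cup \{a\}, \mathcal{F}, \mathcal{P})$.

2. $\exists x_1 \ldots \exists x_n \forall y A(y, x_1, \ldots, x_n)$ is valid in QN5$(C, \mathcal{F}, \mathcal{P})$ if and only if the
formula $\exists x_1 \ldots \exists x_n A(f(x_1, \ldots, x_n), x_1, \ldots, x_n)$ is valid in QN5$(C, \mathcal{F} \cup \{f\}, \mathcal{P})$,
where $f \notin \mathcal{F}$.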

Lemma 4. If $A(x_1, \ldots, x_n)$ is quantifier-free, then $B = \exists x_1 \ldots \exists x_n A(x_1, \ldots, x_n)$
is valid iff it is true in every Herbrand model.



Theorem 4 (Herbrand’s theorem for validity). Let $A$ be a formula in
QN5; then there is an algorithm to convert $A$ into a prenex formula $A^*$, with
only existential quantifiers, such that $A$ is valid if and only if $A^*$ is true in every
Herbrand model (of $A^*$).

4 Tableaux System for QN5

The many-valued semantics allows us to describe a tableaux system for QN5
based on signed formulas; in [24] a propositional tableaux calculus was
introduced and will be extended here to the quantified version. The nodes in the
tableaux are closed formulas labelled with a set of truth values, $S{:}\varphi$ (this
construction is called a signed formula). In fact, we only need the following signs:



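Table 2. Tableaux expansion rules in N5 for $\to$, $\forall$ and $\exists$. For $\wedge$, $\vee$, $\sim$ and $\neg$, the
standard expansion rules for regular connectives are applied. Each rule is written
as premise $\Rightarrow$ expansion; alternative branches are separated by $\mid$.

$\{2\}{:}\varphi \to \psi \;\Rightarrow\; \{\le 0\}{:}\varphi \;\mid\; \{\le 1\}{:}\varphi,\ \{\ge 1\}{:}\psi \;\mid\; \{2\}{:}\psi$
$\{\ge 1\}{:}\varphi \to \psi \;\Rightarrow\; \{\le 0\}{:}\varphi \;\mid\; \{\ge 1\}{:}\psi$
$\{\ge 0\}{:}\varphi \to \psi \;\Rightarrow\; \{\le 0\}{:}\varphi \;\mid\; \{\ge 0\}{:}\psi$
$\{\ge -1\}{:}\varphi \to \psi \;\Rightarrow\; \{\le 0\}{:}\varphi \;\mid\; \{\ge -1\}{:}\psi$
$\{\le 1\}{:}\varphi \to \psi \;\Rightarrow\; \{2\}{:}\varphi,\ \{\le 1\}{:}\psi \;\mid\; \{\ge 1\}{:}\varphi,\ \{\le 0\}{:}\psi$
$\{\le 0\}{:}\varphi \to \psi \;\Rightarrow\; \{\ge 1\}{:}\varphi,\ \{\le 0\}{:}\psi$
$\{\le -1\}{:}\varphi \to \psi \;\Rightarrow\; \{\ge 1\}{:}\varphi,\ \{\le -1\}{:}\psi$
$\{-2\}{:}\varphi \to \psi \;\Rightarrow\; \{\ge 1\}{:}\varphi,\ \{-2\}{:}\psi$

$\{\ge i\}{:}\forall x \varphi(x) \;\Rightarrow\; \{\ge i\}{:}\varphi(t)$
$\{\le i\}{:}\forall x \varphi(x) \;\Rightarrow\; \{\le i\}{:}\varphi(d)$
$\{\ge i\}{:}\exists x \varphi(x) \;\Rightarrow\; \{\ge i\}{:}\varphi(d)$
$\{\le i\}{:}\exists x \varphi(x) \;\Rightarrow\; \{\le i\}{:}\varphi(t)$

where $d$ is a fresh parameter and $t$ is any term.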








$\{\le i\} = \{j \in \mathbf{5} \mid j \le i\}$, $\{\ge i\} = \{j \in \mathbf{5} \mid j \ge i\}$ (we abbreviate $\{-2\} = \{\le -2\}$
and $\{2\} = \{\ge 2\}$).

As usual, we define recursively the concept of tableau: an initial tableau 
is defined and then expansion rules are provided to generate further tableaux. 
More details on tableaux systems for many-valued logics and a general method 
for proving their soundness and completeness can be found in [10]. 
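Whether a candidate expansion rule for a signed implication is faithful to the five-valued matrix can be checked exhaustively, since there are only 25 pairs of argument values. The sketch below does this for a set of implication rules derived here directly from the truth table of $\to$; this rule set is an assumption of the illustration rather than a transcription of the calculus of [24], and the names `RULES` and `rule_is_exact` are hypothetical.

```python
from itertools import product

VALUES = (-2, -1, 0, 1, 2)
ALL = set(VALUES)

def ge(i):
    return {j for j in VALUES if j >= i}

def le(i):
    return {j for j in VALUES if j <= i}

def imp(x, y):
    return 2 if (x <= 0 or x <= y) else y

# premise sign -> branches; a branch is (sign for phi, sign for psi),
# with ALL meaning that the branch does not constrain that subformula.
RULES = {
    "=2":   (ge(2),  [(le(0), ALL), (le(1), ge(1)), (ALL, ge(2))]),
    ">=1":  (ge(1),  [(le(0), ALL), (ALL, ge(1))]),
    ">=0":  (ge(0),  [(le(0), ALL), (ALL, ge(0))]),
    ">=-1": (ge(-1), [(le(0), ALL), (ALL, ge(-1))]),
    "<=1":  (le(1),  [(ge(2), le(1)), (ge(1), le(0))]),
    "<=0":  (le(0),  [(ge(1), le(0))]),
    "<=-1": (le(-1), [(ge(1), le(-1))]),
    "=-2":  (le(-2), [(ge(1), le(-2))]),
}

def rule_is_exact(premise, branches):
    """The premise holds for (x, y) exactly when some branch is satisfied."""
    return all(
        (imp(x, y) in premise) == any(x in s and y in t for s, t in branches)
        for x, y in product(VALUES, repeat=2)
    )

for name, (premise, branches) in RULES.items():
    print(name, rule_is_exact(premise, branches))  # every line ends in True
```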

Definition 2. Let $\Pi = \{\varphi_1, \ldots, \varphi_n\}$ be a set of formulas and $\psi$ a formula.

1. The tableau consisting of the single branch $\{2\}{:}\varphi_1, \ldots, \{2\}{:}\varphi_n, \{\le 1\}{:}\psi$
is called the initial tableau for $(\Pi, \psi)$.

2. If $\mathcal{T}$ is a tableau for $(\Pi, \psi)$ and $\mathcal{T}'$ is the tree obtained from $\mathcal{T}$ by applying one
of the expansion rules in Table 2, then $\mathcal{T}'$ is a tableau for $(\Pi, \psi)$.

3. A branch $B$ in a tableau $\mathcal{T}$ is called closed if it contains a variable $p$ with
two signs, $S{:}p$, $S'{:}p$, such that $S \cap S' = \emptyset$.

4. A tableau $\mathcal{T}$ is called closed if every branch is closed.

Theorem 5 (Soundness and completeness of the tableaux system).
The inference $\varphi_1, \ldots, \varphi_n \models \psi$ is valid if and only if there exists a closed tableau
for $(\{\varphi_1, \ldots, \varphi_n\}, \psi)$.



5 Equilibrium in Quantified N5

In the propositional case equilibrium logic is most easily characterised by a simple
minimal model condition on N5 Kripke models. If we mirror this condition in the
quantified case, we are led to consider a partial ordering $\le$ on FOHT$_c$ models.

Definition 3. Given any two FOHT$_c$ models $\mathcal{M} = (D, H, T)$ and $\mathcal{M}' = (D', H', T')$,
we set $\mathcal{M} \le \mathcal{M}'$ if $D = D'$, $T = T'$ and $H \subseteq H'$.

Definition 4. Let $\Pi$ be a set of first-order N5 formulas and $(D, H, T)$ a model
of $\Pi$.

1. $(D, H, T)$ is said to be total if $H = T$.

2. $(D, H, T)$ is said to be an equilibrium model of $\Pi$ if it is minimal under $\le$
among models of $\Pi$, and it is total.

In other words a model $(D, H, T)$ of $\Pi$ is in equilibrium if it is total and there
is no model $(D, H', T)$ of $\Pi$ with $H' \subset H$.

The same property can be equivalently expressed using the many-valued
semantics. Let $\Pi$ be a set of formulas in $(C, \mathcal{F}, \mathcal{P})$. In QN5, the model $\sigma$ of $\Pi$
is total if $\sigma(L) \in \{-2, 0, 2\}$ for all ground literals $L$; and the ordering $\sigma_1 \le \sigma_2$
among models $\sigma_1$ and $\sigma_2$ of $\Pi$ holds iff for every literal $L$ in $\mathrm{Lit}(C, \mathcal{F}, \mathcal{P})$ the
following properties hold:

1. $\sigma_1(L) = 0$ if and only if $\sigma_2(L) = 0$.

2. If $\sigma_1(L) \ge 1$, then $\sigma_1(L) \le \sigma_2(L)$.

3. If $\sigma_1(L) \le -1$, then $\sigma_1(L) \ge \sigma_2(L)$.

This yields characterisations of total model and equilibrium model, clearly
equivalent to the earlier one.
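For a finite ground theory the many-valued characterisation suggests a naive brute-force search. The following sketch is illustrative only and is not proposed as a practical algorithm: formulas are assumed to be nested tuples over the operators `atom`, `neg` (strong negation), `not` (weak negation), `and`, `or` and `imp`, and the helper names are hypothetical. A total model is rejected as soon as some other model with the same `T` but a smaller `H` exists, i.e. one obtained by weakening some values 2 to 1 or -2 to -1.

```python
from itertools import product

def value(f, s):
    """Five-valued value of formula f under the assignment s (atom -> value)."""
    op = f[0]
    if op == "atom":
        return s[f[1]]
    if op == "neg":                       # strong negation
        return -value(f[1], s)
    if op == "not":                       # weak negation
        x = value(f[1], s)
        return 2 if x <= 0 else -x
    x, y = value(f[1], s), value(f[2], s)
    if op == "and":
        return min(x, y)
    if op == "or":
        return max(x, y)
    if op == "imp":
        return 2 if (x <= 0 or x <= y) else y
    raise ValueError(f"unknown operator: {op}")

def is_model(theory, s):
    return all(value(f, s) == 2 for f in theory)

def equilibrium_models(theory, atoms):
    result = []
    for vals in product((-2, 0, 2), repeat=len(atoms)):    # total assignments
        total = dict(zip(atoms, vals))
        if not is_model(theory, total):
            continue
        # same T, smaller H: each value 2 may drop to 1, each -2 to -1
        options = [((v, v // 2) if v else (0,)) for v in vals]
        smaller = (dict(zip(atoms, c)) for c in product(*options) if c != vals)
        if not any(is_model(theory, s) for s in smaller):
            result.append(total)
    return result

# The single rule  p <- not q,  read as the formula  (not q) -> p.
program = [("imp", ("not", ("atom", "q")), ("atom", "p"))]
print(equilibrium_models(program, ["p", "q"]))  # [{'p': 2, 'q': 0}]
```

The only equilibrium model found makes `p` true and leaves `q` undefined, which matches the single answer set {p} of this program.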

5.1 Equilibrium Logic and Answer Set Semantics

We assume that the reader is familiar with answer set semantics for logic
programs as described in [7,17]. In the propositional case equilibrium logic
generalises answer set semantics in the following sense. For all the usual classes of
logic programs, including normal, disjunctive and nested programs, equilibrium
models correspond to answer sets. The ‘translation’ from the syntax of programs
to N5 propositional formulas is the trivial one, e.g. a ground rule of a disjunctive
program of the form

$$K_1 \vee \ldots \vee K_k \leftarrow L_1, \ldots, L_m, \mathit{not}\ L_{m+1}, \ldots, \mathit{not}\ L_n$$

where the $L_i$ and $K_j$ are literals corresponds to the N5 sentence

$$L_1 \wedge \ldots \wedge L_m \wedge \neg L_{m+1} \wedge \ldots \wedge \neg L_n \to K_1 \vee \ldots \vee K_k$$



Theorem 6 ([20,16]). For any ground logic program $\Pi$, an N5 model $(T, T)$
is an equilibrium model of $\Pi$ if and only if $T$ is an answer set of $\Pi$.

Two propositional theories $\Pi$ and $\Pi'$ are said to be logically equivalent, in
symbols $\Pi \equiv \Pi'$, if they have the same N5 models, and simply equivalent if they
have the same equilibrium models. They are said to be strongly equivalent, in
symbols $\Pi \equiv_s \Pi'$, if, for any $\Sigma$, $\Pi \cup \Sigma$ is equivalent to $\Pi' \cup \Sigma$. An important
property is the following.

Theorem 7 ([16]). Any two theories $\Pi$ and $\Pi'$ are strongly equivalent iff they
are logically equivalent, i.e. $\Pi \equiv_s \Pi'$ iff $\Pi \equiv \Pi'$.

Let us now turn to the situation in the first-order case. We are going to
consider universal theories. A first-order N5 theory $\Pi$ in some signature $(C, \mathcal{F}, \mathcal{P})$
is said to be universal if $\Pi$ is QN5 equivalent to a set of sentences in prenex
form all of whose quantifiers are universal. Let $\Pi$ be a universal theory (assumed
to be presented in prenex form) in a signature $(C, \mathcal{F}, \mathcal{P})$. Let $D \supseteq C$.

We define the grounding of $\Pi$ with respect to $D$ as

$$g(\Pi, D) = \bigcup_{B \in \Pi} g(B, D)$$

where

$$g(\forall x_1 \ldots \forall x_n A(x_1, \ldots, x_n), D) = \{A(t_1, \ldots, t_n);\ t_1, \ldots, t_n \in \mathrm{Term}(D, \mathcal{F})\}$$

Clearly, any grounding of a first-order theory $\Pi$ can be represented as a (possibly
infinite) theory in propositional N5 logic and so we can now relate the quantified
version of equilibrium logic for universal theories to the propositional equilibrium
logic of their ground versions.
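When the set of ground terms is finite (for instance, when there are no function symbols), the grounding of a universal sentence can be enumerated directly. The sketch below assumes that the quantifier-free matrix is given as a Python format string with one named field per universally quantified variable; this encoding and the name `ground` are assumptions made for illustration.

```python
from itertools import product

def ground(variables, matrix, terms):
    """All instances of `matrix` obtained by replacing the listed variables
    with ground terms from the finite set `terms`."""
    instances = set()
    for combo in product(sorted(terms), repeat=len(variables)):
        instances.add(matrix.format(**dict(zip(variables, combo))))
    return instances

# Grounding of  forall x (q(x) -> p(x))  with respect to D = {a, b}.
print(sorted(ground(["x"], "q({x}) -> p({x})", {"a", "b"})))
# ['q(a) -> p(a)', 'q(b) -> p(b)']
```

The grounding of a theory is then the union of the groundings of its sentences, as in the definition of $g(\Pi, D)$.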







Lemma 5. Let $\Pi$ be a universal theory in $(C, \mathcal{F}, \mathcal{P})$. Then $(D, H, T) \models \Pi$ iff
$(H, T)$ is a propositional N5-model of $g(\Pi, D)$.

Proof. Assume that the sentences of $\Pi$ are presented in prenex form with
universal quantifiers. By the semantics for QN5 we have that

$$\begin{aligned}
2 = \sigma(\forall x_1 \ldots \forall x_n A(x_1, \ldots, x_n)) &= \min\{\sigma(A(t_1, \ldots, t_n)) \mid t_1, \ldots, t_n \in T\} \\
&\iff \sigma(A(t_1, \ldots, t_n)) = 2 \text{ for all } (t_1, \ldots, t_n) \in T^n \\
&\iff \sigma \models g(\forall x_1 \ldots \forall x_n A(x_1, \ldots, x_n), D)
\end{aligned}$$

□

From this and the definition of equilibrium model we obtain immediately: 

Theorem 8. Let $\Pi$ be a universal theory in $(C, \mathcal{F}, \mathcal{P})$ and let $(D, T, T)$ be a
total model of $\Pi$. Then $(D, T, T)$ is an equilibrium model of $\Pi$ iff $(T, T)$ is a
propositional equilibrium model of $g(\Pi, D)$.

Combining this property with that of Theorem 6 we can relate quantified
equilibrium logic to the answer set semantics of logic programs. The rules of any
logic program are interpreted as holding for all values of the free variables in
the Herbrand universe. Hence any program (disjunctive, nested, etc.) for which
answer set semantics is defined can be regarded as the universal closure of the
translation of the program rules into first-order N5. Therefore we can identify
the equilibrium models of a logic program $\Pi$ with the equilibrium models of the
universal closure of $\Pi$, which is clearly a universal theory. If we make the
standard domain closure assumption, then it is natural to restrict attention to the
Herbrand models of a program $\Pi$. Recall that by Lemma 2 every consistent
universal theory has a Herbrand model. We obtain:

Corollary 2. Let $\Pi$ be a logic program in the signature $(C, \mathcal{F}, \mathcal{P})$. A total
Herbrand model $(C, T, T)$ of the universal closure of $\Pi$ is an equilibrium model of $\Pi$
iff $T$ is an answer set of $\Pi$.

Proof. Immediate from Theorem 8 together with the fact that the answer sets of
a program $\Pi$ with variables are identified with the answer sets of the grounding
of $\Pi$ with respect to the Herbrand universe of $\Pi$. The latter coincides with
$g(\Pi, C)$ defined above. Applying Theorem 6 completes the argument. □

Let $\Pi_1$ and $\Pi_2$ be QN5 equivalent theories in $(C, \mathcal{F}, \mathcal{P})$. Then clearly the
ground versions of $\Pi_1$ and $\Pi_2$ are equivalent in propositional N5, for any
groundings. This leads to the following observation which follows from Lemma 5 and
Theorem 7.

Corollary 3. Let $\Pi_1, \Pi_2$ be logic programs (with variables) in the signature
$(C, \mathcal{F}, \mathcal{P})$. $\Pi_1$ and $\Pi_2$ are QN5 equivalent iff for any $D \supseteq C$,
$g(\Pi_1, D) \equiv_s g(\Pi_2, D)$.

The latter condition amounts to a kind of strong equivalence for open programs,
for which logical equivalence in QN5 provides a necessary and sufficient
condition. Corollaries 2 and 3 also suggest that, as hoped, the logic QN5 may play a
role not only in extending ASP but also in its implementation. In particular, any
transformation of a program with variables that leads to a QN5-equivalent
program preserves answer sets in a strong sense. This suggests that logical inference
in QN5 may be used as a tool for program transformation and simplification
prior to grounding.



6 Related Work and Concluding Remarks 

Unlike in the propositional case ([14]), the study of intermediate predicate logics 
with strong negation is largely uncharted territory. A recent exception is the 
work of [11] who investigate the constant domain Kripke semantics for certain 
special cases, including general linear frames, but not specifically treating the 
two-element frames of the here-and-there logic. The authors are interested in 
an extension of the strong negation axioms and this may be one reason why 
they are led to develop a quite complex and non-standard proof theory. In the 
literature on many-valued logics, it seems that the Gödel predicate logics have
mainly been studied in the infinite-valued or fuzzy case, see e.g. [1], rather than
in the three-valued case, even without the addition of strong negation. To our
knowledge in none of these areas have nonmonotonic logics been previously built 
on the underlying nonclassical base logic. 

Our work in this paper is a natural development out of our previous work
on propositional equilibrium logic. By choosing as a first step the many-valued
semantics, we were able to extend both the model theory and the tableaux-based
proof theory for propositional N5 in a natural manner. We showed that the resulting
first-order logic, QN5, enjoys several properties, including Skolem forms
and Herbrand theorems, that are important in automated reasoning. Moreover
in this logic equilibrium models admit a very simple characterisation. As we 
saw, they bear a natural relation to the old equilibrium semantics in the propo- 
sitional case, when ground versions of universal theories are considered. In the 
same manner the new logic extends answer set semantics for all the usual kinds 
of logic programs. 

In this paper we have not attempted to provide algorithms for computing 
equilibrium models, and this remains a major challenge for the future. Among 
previous work we know of on first-order representations of stable models, the 
most important appears to be that of Eiter, Lu and Subrahmanian [5]. That 
work does provide some algorithms for computing first-order stable models and 
some discussion of implementation methodology. These are issues that we hope 
to address in the future. However, even at this early stage, there is an important 
difference in our approach compared to that of [5] : ours is anchored in a logical 
approach to stable reasoning rather than a purely operational one. Consequently, 
even if many details of the underlying logic remain open for further study, we 
can already be fairly confident that proof methods and techniques of automated 
deduction in nonclassical logics will be useful for computing and for understanding
the first-order nonmonotonic systems we are interested in. Even now we have
seen how inference in QN5 might be applied for program simplification.







Aside from work on proof theory and algorithms, we expect future research 
to tackle a range of other issues including the metatheory of QN5, alternative
definitions of equilibrium, other characterisations and properties of the equilib- 
rium logic and examples of how it might be applied beyond the sphere of logic 
programs. 



References 

1. M. Baaz, A. Ciabattoni, and C.G. Fermüller. Herbrand’s theorem for Prenex
Gödel logic and its consequences for theorem proving. In Proc. of LPAR 2001,
LNCS 2250, pp. 201-216. Springer, 2001.

2. M. Balduccini, M. Gelfond, R. Watson, and M. Nogueira. The USA-Advisor: A
case study in answer set planning. In Proc. of LPNMR 2001, LNCS 2173, pp.
439-444. Springer, 2001.

3. C. Baral. Knowledge Representation, Reasoning and Declarative Problem Solving.
Cambridge University Press, 2003.

4. F. Calimeri, S. Galizia, M. Ruffolo, and P. Rullo. Enhancing disjunctive logic
programming for ontology specification. In Proc. of International Joint Conference
on Declarative Programming, AGP’03. Univ. degli Studi di Reggio Calabria, Italy,
September 2003.

5. T. Eiter, J. Lu, and V.S. Subrahmanian. A first-order representation of stable
models. AI Communications, 1:53-73, 1998.

6. M. Gelfond and V. Lifschitz. The stable model semantics for logic programming.
In Proc. of ICLP’88, pp. 1070-1080, 1988. The MIT Press.

7. M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive
databases. New Generation Computing, 9:365-385, 1991.

8. K. Gödel. Zum intuitionistischen Aussagenkalkül. Anzeiger der Akademie der
Wissenschaften Wien, mathematisch, naturwissenschaftliche Klasse, 69:65-66, 1932.

9. Y. Gurevich. Intuitionistic logic with strong negation. Studia Logica, 36(1-2):49-59,
1977.

10. R. Hähnle. Automated Deduction in Multiple-Valued Logics, volume 10 of
International Series of Monographs on Computer Science. Oxford University Press,
1994.

11. I. Hasuo and R. Kashima. Kripke completeness of first-order constructive logics
with strong negation. Logic Journal of the IGPL, 11(6):615-646, 2003.

12. A. Heyting. Die formalen Regeln der intuitionistischen Logik. Sitzungsberichte der
Preussischen Akademie der Wissenschaften, Physikalisch-mathematische Klasse,
pp. 42-56, 1930.

13. T. Janhunen, I. Niemelä, D. Seipel, P. Simons, and J.-H. You. Unfolding partiality
and disjunctions in stable model semantics. CoRR: cs.AI/0303009, March 2003.

14. M. Kracht. On extensions of intermediate logics by strong negation. Journal of
Philosophical Logic, 27(1):49-73, 1998.

15. N. Leone, G. Pfeifer, W. Faber, T. Eiter, G. Gottlob, S. Perri, and F. Scarcello.
The dlv system for knowledge representation and reasoning. CoRR: cs.AI/0211004,
September 2003.

16. V. Lifschitz, D. Pearce, and A. Valverde. Strongly equivalent logic programs. ACM
Transactions on Computational Logic, 2(4):526-541, October 2001.

17. V. Lifschitz, L.R. Tang, and H. Turner. Nested expressions in logic programs.
Annals of Mathematics and Artificial Intelligence, 25(3-4):369-389, 1999.

18. J. Łukasiewicz. Die Logik und das Grundlagenproblem. In Les entretiens de Zürich
sur les fondements et la méthode des sciences mathématiques (Zürich, 1938), pp.
82-100, 1941.

19. D. Nelson. Constructible falsity. Journal of Symbolic Logic, 14(2):16-26, 1949.

20. D. Pearce. A new logical characterization of stable models and answer sets. In
Proc. of NMELP 96, LNCS 1216, pp. 57-70. Springer, 1997.

21. D. Pearce. From here to there: Stable negation in logic programming. In Dov
Gabbay and Heinrich Wansing, editors, What is Negation?, pp. 161-181. Kluwer
Academic Pub., 1999.

22. D. Pearce. Simplifying logic programs under answer set semantics. In Vladimir
Lifschitz and Bart Demoen, editors, Proc. of ICLP04. Springer, 2004 (to appear).

23. D. Pearce, I.P. de Guzmán, and A. Valverde. Computing equilibrium models using
signed formulas. In Proc. of CL2000, LNCS 1861, pp. 688-703. Springer, 2000.

24. D. Pearce, I.P. de Guzmán, and A. Valverde. A tableau calculus for equilibrium
entailment. In Proc. of TABLEAUX 2000, LNAI 1847, pp. 352-367. Springer,
2000.

25. D. Pearce, V. Sarsakov, T. Schaub, H. Tompits, and S. Woltran. A polynomial
translation of logic programs with nested expressions into disjunctive logic
programs: Preliminary report. In Proc. of ICLP 2002, LNCS 2401, pp. 405-420.
Springer, 2002.

26. D. Pearce, H. Tompits, and S. Woltran. Encodings for equilibrium logic and logic
programs with nested expressions. In Proc. of EPIA 2001, LNCS 2258, pp. 306-320.
Springer, 2001.

27. D. Pearce and A. Valverde. Uniform equivalence for equilibrium logic and logic
programs. In Proc. of LPNMR’04, LNAI 2923, pp. 194-206. Springer, 2004.

28. D. Pearce and A. Valverde. Synonymous theories in answer set programming and
equilibrium logic. In Proc. of ECAI 2004 (Valencia, Spain), 2004 (to appear).

29. P. Simons, I. Niemelä, and T. Soininen. Extending and implementing the stable
model semantics. Artificial Intelligence, 138(1-2):181-234, 2002.

30. Y.S. Smetanich. On completeness of a propositional calculus with an additional
operation of one variable (in Russian). Trudy Moscovskogo Matematiceskogo
Obscestova, 9:357-372, 1960.

31. D. van Dalen. Intuitionistic logic. In Dov Gabbay and Franz Guenthner, editors,
Handbook of Philosophical Logic, Volume III: Alternatives in Classical Logic,
Dordrecht, 1986. D. Reidel Publishing Co.

32. N.N. Vorob’ev. A constructive propositional calculus with strong negation (in
Russian). Doklady Akademii Nauk SSSR, 85:465-468, 1952.

33. N.N. Vorob’ev. The problem of deducibility in constructive propositional calculus
with strong negation (in Russian). Doklady Akademii Nauk SSSR, 85:689-692, 1952.




Characterizations for Relativized Notions of 
Equivalence in Answer Set Programming* 



Stefan Woltran 

Institut für Informationssysteme 184/3, Technische Universität Wien,
Favoritenstraße 9-11, A-1040 Vienna, Austria
stefan@kr.tuwien.ac.at



Abstract. Recent research in nonmonotonic logic programming focuses on al- 
ternative notions of equivalence. In particular, strong and uniform equivalence 
are both proposed as useful tools to optimize (parts of) a logic program. More 
specifically, given a set P of program rules and a possible optimization Q, strong 
(resp. uniform) equivalence requires that adding any set S of rules (resp. facts) to 
P and Q simultaneously results in equivalent programs, i.e., $P \cup S$ and $Q \cup S$
possess the same stable models. However, in practice it is often necessary to relax
this condition in such a way that dedicated internal atoms in P or Q are no
longer allowed to occur in the possible extensions S. In this paper, we consider
these relativized notions of both uniform and strong equivalence and provide se- 
mantical characterizations by generalizing the notions of UE- and SE-modelhood. 
These new characterizations capture all notions of equivalence including ordinary 
equivalence in a uniform way. Finally, we analyze the complexity of the intro- 
duced equivalence tests for the important classes of normal and disjunctive logic 
programs. As a by-product, we reduce the tests for relativized equivalences to or- 
dinary equivalence between two programs. These reductions may serve as a basis 
for implementation. 



1 Introduction 

Recent research in nonmonotonic logic programming focuses on different notions of 
equivalence between two logic programs. Besides the traditional notion of (ordinary) 
equivalence, i.e., checking whether two programs P and Q possess the same stable mod- 
els, the more restrictive notions of strong [15,29,23,17,4,3] and uniform equivalence [6, 
24,7,8] have been investigated. Formally, two programs P and Q are strongly equivalent 
(resp. uniformly equivalent), if, for any set S of rules (resp. atoms), the programs P U S 
and Q U S are equivalent in the ordinary sense. These notions have been proposed as 
a useful tool to change or optimize parts of logic programs avoiding an analysis of the 
whole program [29,22,7]. Indeed, if a program P contains a subprogram Q which is 
strongly equivalent to a program Q', then one may replace Q in P by Q', in particular if
the resulting program is simpler to evaluate than the original one. Semantical characteri- 
zations as strong-equivalence models (SE-models [29]) or uniform-equivalence models 
(UE-models [6]) provide valuable tools for these purposes. 

* Supported by the Austrian Science Fund (FWF) under project P15068-INF and by the European
Commission under projects FET-2001-37004 WASP and IST-2001-33570 INFOMIX.



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 161-173, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004







However, in practice it is often convenient to relax the condition that arbitrary rules 
or facts are considered to be assigned to S. In particular, one wants to exclude dedicated 
atoms from the possible extension S. Such atoms may play the role of internal atoms in 
the compared program modules Q and Q', and are not considered to appear anywhere
else in the complete program P. 

Formally, we define strong (resp. uniform) equivalence relative to a given set of
atoms A between two programs P and Q as the test whether, for all sets of rules S over
A (resp. facts $S \subseteq A$), $P \cup S$ and $Q \cup S$ have the same stable models.¹

Relativizing uniform equivalence has already been considered in the context of
DATALOG, where the notions of uniform equivalence and (DATALOG)-equivalence are as
follows: Two DATALOG programs P and Q are uniformly equivalent [18,26] iff, for
any set A of atoms, $P \cup A$ and $Q \cup A$ have the same output; P and Q are
(DATALOG)-equivalent iff, for any set S of external atoms (an atom is external if it
does not occur in any rule head), $P \cup S$ and $Q \cup S$ have the same output.
Interestingly, uniform equivalence between DATALOG (Horn) programs is decidable [26],
while DATALOG-equivalence is known to be undecidable [27] for this class of programs.
Indeed, the latter notion can be seen as a special case of relativized uniform
equivalence. This observation motivates an analysis of the computational complexity
of relativized notions of equivalence also in the propositional case, which we are
interested in here.

Our main results can be summarized as follows: 

1. For both strong and uniform relativized equivalence we provide suitable semantical
characterizations by generalizing SE-models as well as UE-models. Our new
characterizations of relativized SE- and UE-models allow us to capture all considered
notions of equivalence (including ordinary equivalence) in a uniform way.

2. We show that relativized strong equivalence shares an important property with gen- 
eral strong equivalence, viz. that constraining the rules in the possible extensions to 
a very simple syntactical form does not lead to a different concept. 

3. We provide a reduction of the tests for relativized equivalences to ordinary equiva- 
lence between two programs. This may serve as a basis for implementation. 

4. Concerning complexity, we show that relativized uniform equivalence has the same
worst-case complexity as general uniform equivalence, for both normal
(coNP-completeness) and disjunctive logic programs ($\Pi_2^P$-completeness); and that
relativized strong equivalence has the same worst-case complexity as general strong
equivalence in the case of normal logic programs, namely coNP-completeness.

5. Between disjunctive logic programs P and Q, the complexity of strong equivalence
relative to a set of atoms A is shown to remain in coNP, whenever the number of
atoms in P and Q not contained in A is bounded by a constant; in general, the task
is $\Pi_2^P$-complete.

A general notion of equivalence has also been introduced by Inoue and Sakama [12]. In 
their framework, called update equivalence, one can exactly specify a set of arbitrary 
rules which may be added to the programs under consideration and, furthermore, a set of 
rules which may be deleted. However, for this explicit enumeration of rules it seems more 

1 The notion of strong equivalence relative to a given set of atoms was suggested by Lin in [17] 
but not further investigated. 







complicated to suitably extend the important characterizations of (generalized variants
of) SE-models and UE-models in a reasonable way to that approach.

2 Preliminaries 

We deal with propositional disjunctive logic programs, containing rules r (over a set of
atoms At) of the form

$a_1 \vee \cdots \vee a_l \leftarrow b_1, \ldots, b_m, \mathit{not}\, b_{m+1}, \ldots, \mathit{not}\, b_n$, (1)

($l \geq 0$, $n \geq m \geq 0$), where all $a_i$, $b_j$ are from At, and not denotes default negation.
A rule r is normal, if $l \leq 1$; unary, if $l = 1$ and $m = n \leq 1$; a fact, if $l = 1$ and
$m = n = 0$. For facts, we sometimes write $a$ instead of $a \leftarrow$. The head of r is the set
$H(r) = \{a_1, \ldots, a_l\}$; the body of r is $B(r) = \{b_1, \ldots, b_m, \mathit{not}\, b_{m+1}, \ldots, \mathit{not}\, b_n\}$. We
also define $B^+(r) = \{b_1, \ldots, b_m\}$ and $B^-(r) = \{b_{m+1}, \ldots, b_n\}$.

A disjunctive logic program (DLP) over At, or simply a program, is a finite set of 
rules over At. A DLP P is called a normal logic program (NLP) (resp. unary program) 
if every rule in P is normal (resp. unary). The set of all atoms occurring in a program P 
is denoted by atm(P). 

We recall the stable-model semantics for DLPs [11,25]. Let I be an interpretation,
i.e., a set of atoms. An atom a is true under I iff $a \in I$, and false under I otherwise.
I satisfies a rule r, denoted $I \models r$, iff $I \cap H(r) \neq \emptyset$, whenever $B^+(r) \subseteq I$ and
$I \cap B^-(r) = \emptyset$ jointly hold. Furthermore, I is a model of a program P, denoted $I \models P$,
iff $I \models r$, for all $r \in P$. Note that the empty program has any interpretation as its model.
The Gelfond-Lifschitz reduct of a program P relative to a set of atoms I is the program
$P^I = \{ H(r) \leftarrow B^+(r) \mid r \in P, B^-(r) \cap I = \emptyset \}$. An interpretation I is a stable model
of a program P iff I is a minimal model (under set inclusion) of $P^I$. An equivalent
characterization of stable models is as follows: I is a stable model of a program P iff
$I \models P$ and, for each $J \subset I$, $J \not\models P^I$. The set of all stable models of P is denoted by
$\mathcal{SM}(P)$.
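
The definitions above translate directly into a brute-force procedure. The following Python sketch is illustrative only: the encoding of a rule as a triple of frozensets and the helper names (`rule`, `powerset`, `satisfies`, `reduct`, `stable_models`) are assumptions made for this example and are not taken from the paper. The enumeration is exponential in the number of atoms, so it is only suitable for tiny programs.

```python
from itertools import combinations

def rule(head=(), pos=(), neg=()):
    """Encode  head <- pos, not neg  as a triple of frozensets (assumed encoding)."""
    return (frozenset(head), frozenset(pos), frozenset(neg))

def powerset(items):
    """All subsets of a collection, each as a frozenset."""
    items = list(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def satisfies(interp, program):
    """I |= P: whenever a rule body is true under I, some head atom is in I."""
    return all(bool(interp & h) or not (p <= interp and not (interp & n))
               for h, p, n in program)

def reduct(program, interp):
    """Gelfond-Lifschitz reduct P^I: drop rules blocked by I, strip 'not'."""
    return [(h, p, frozenset()) for h, p, n in program if not (interp & n)]

def stable_models(program):
    """All I with I |= P such that no proper subset J of I satisfies P^I."""
    atoms = frozenset().union(*(h | p | n for h, p, n in program))
    found = set()
    for i in powerset(atoms):
        red = reduct(program, i)
        if satisfies(i, program) and not any(
                satisfies(j, red) for j in powerset(i) if j < i):
            found.add(i)
    return found

# Hypothetical two-rule program {a <- not b; b <- not a}.
choice = [rule(["a"], [], ["b"]), rule(["b"], [], ["a"])]
print(stable_models(choice))
```

For the hypothetical program `choice`, the two interpretations {a} and {b} are returned, while {a, b} is rejected because the empty set already satisfies the (empty) reduct.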

Under stable semantics, two DLPs P and Q are regarded as equivalent, denoted
$P \equiv Q$, iff $\mathcal{SM}(P) = \mathcal{SM}(Q)$. The more restrictive forms of strong equivalence and
uniform equivalence are as follows:

Definition 1. Let P and Q be two DLPs. Then, 

(i) P and Q are strongly equivalent, denoted $P \equiv_s Q$, iff, for any set R of rules,
$P \cup R \equiv Q \cup R$;

(ii) P and Q are uniformly equivalent, denoted $P \equiv_u Q$, iff, for any set F of facts,
$P \cup F \equiv Q \cup F$.

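As an example, consider the two programs

$P = \{a \vee b \leftarrow;\ a \leftarrow b;\ b \leftarrow a;\ \leftarrow \mathit{not}\, c\}$; and
$P' = \{a \leftarrow \mathit{not}\, b;\ b \leftarrow \mathit{not}\, a;\ a \leftarrow b;\ b \leftarrow a;\ \leftarrow \mathit{not}\, c\}$.

The only difference between P and P' is that $a \vee b \leftarrow$ is replaced by the two rules
$a \leftarrow \mathit{not}\, b$; $b \leftarrow \mathit{not}\, a$. Although P and P' are equivalent (both have no stable model,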







since c cannot be derived), it can be checked that $P \not\equiv_u P'$ holds. Thus, $P \not\equiv_s P'$ holds,
as well. In particular, it suffices to add a fact c; then, $P \cup \{c\}$ has {a, b, c} as a stable
model, while $P' \cup \{c\}$ has no stable model.

Both uniform and strong equivalence enjoy interesting semantical characterizations
[15,28,29,6].

Definition 2. A pair (X, Y) of interpretations such that $X \subseteq Y$ is called an
SE-interpretation. An SE-interpretation (X, Y) is an SE-model of a program P, if $Y \models P$
and $X \models P^Y$.

Proposition 1 ([28,29]). Two programs P and Q are strongly equivalent iff they possess 
the same set of SE-models. 

Recently, the following counterpart to SE-models, characterizing uniform equivalence
for (finite) logic programs, has been defined [6].

Definition 3. Let P be a program and (X, Y) an SE-model of P. Then, (X, Y) is a
UE-model of P iff, for every SE-model (X', Y) of P, it holds that $X \subset X'$ implies
$X' = Y$.

Proposition 2 ([6]). Two programs P and Q are uniformly equivalent iff they possess 
the same set of UE-models. 

To check strong or uniform equivalence between two programs P and Q, it is
obviously sufficient to consider SE-interpretations (X, Y) over $atm(P \cup Q)$, i.e., with
$X \subseteq Y \subseteq atm(P \cup Q)$. We implicitly make use of this simplification when
convenient.

Recall our example programs P and P'. The SE-models (over {a, b, c}) of the
program P are given by² (abc, abc), (ab, abc); whilst the program P' has two additional
SE-models (c, abc), (∅, abc). Hence, P and P' are not strongly equivalent. Concerning
uniform equivalence, note that the pair (∅, abc) is not a UE-model of P', since (ab, abc)
(or (c, abc), alternatively) is an SE-model of P' preventing (∅, abc) from being a UE-model
of P' by definition. Still, P and P' have different UE-models, i.e., (abc, abc), (ab, abc)
for P and additionally (c, abc) for P'. By Proposition 2, this yields $P \not\equiv_u P'$ as claimed
above.
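
The SE- and UE-models listed for P and P' can be recomputed mechanically. The sketch below is a naive enumeration under assumed conventions: rules are triples of frozensets, `P1` stands for P', and the function names are hypothetical. It is meant as a cross-check of Definitions 2 and 3, not as an efficient algorithm.

```python
from itertools import combinations

def rule(head=(), pos=(), neg=()):
    """head <- pos, not neg. Strings are split character-wise, so this
    shorthand only works for single-letter atoms."""
    return (frozenset(head), frozenset(pos), frozenset(neg))

def powerset(items):
    items = list(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def satisfies(interp, program):
    return all(bool(interp & h) or not (p <= interp and not (interp & n))
               for h, p, n in program)

def reduct(program, interp):
    return [(h, p, frozenset()) for h, p, n in program if not (interp & n)]

def se_models(program, atoms):
    """Pairs (X, Y) with X a subset of Y, Y |= P and X |= P^Y (Definition 2)."""
    return {(x, y) for y in powerset(atoms) if satisfies(y, program)
            for x in powerset(y) if satisfies(x, reduct(program, y))}

def ue_models(program, atoms):
    """SE-models (X, Y) that are maximal below Y (Definition 3)."""
    se = se_models(program, atoms)
    return {(x, y) for x, y in se
            if all(x2 == y for x2, y2 in se if y2 == y and x < x2)}

# The example programs P and P' from the text.
P = [rule("ab"), rule("a", "b"), rule("b", "a"), rule((), (), "c")]
P1 = [rule("a", (), "b"), rule("b", (), "a"),
      rule("a", "b"), rule("b", "a"), rule((), (), "c")]

for name, prog in (("P", P), ("P'", P1)):
    pairs = sorted(("".join(sorted(x)), "".join(sorted(y)))
                   for x, y in ue_models(prog, "abc"))
    print(name, "UE-models:", pairs)
```

The printed pairs use the same shorthand as the text (abc for {a, b, c}; the empty string stands for the empty set).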

Finally, we review the complexity results for equivalence testing in logic program- 
ming for the propositional case. 

Proposition 3. For normal logic programs, the problems of ordinary, strong, or uniform
equivalence are complete for the class coNP. In the case of disjunctive logic programs,
the complexity remains coNP-complete for strong equivalence, while deciding uniform
or ordinary equivalence is $\Pi_2^P$-complete for DLPs.

The results for ordinary equivalence can be obtained from results by Marek and 
Truszczynski [19] for normal logic programs (cf. also [1]) and by Eiter and Gottlob for 
disjunctive logic programs [9] (for an explicit proof we refer to [21]). Complexity of 
uniform equivalence has been analyzed by Eiter and Fink [6] and the results concerning 
strong equivalence have been shown by several authors [23,29,17]. 

2 We write abc instead of {a, b, c}, a instead of {a}, etc. 







3 Relativizing Strong and Uniform Equivalence 

In what follows, we formally introduce the notions of relativized strong equivalence 
(RSE) and relativized uniform equivalence (RUE). 

Definition 4. Let P and Q be programs and let A be a set of atoms. Then, 

(i) P and Q are strongly equivalent relative to A, denoted $P \equiv_s^A Q$, iff $P \cup R \equiv Q \cup R$,
for all programs R over A;

(ii) P and Q are uniformly equivalent relative to A, denoted $P \equiv_u^A Q$, iff $P \cup F \equiv
Q \cup F$, for all sets of facts $F \subseteq A$.

Observe that the range of applicability of these notions covers ordinary equivalence
(by setting $A = \emptyset$) between two programs P, Q, and general strong (resp. uniform)
equivalence (whenever $atm(P \cup Q) \subseteq A$). Also the following relation holds: For any
set A of atoms, let $A' = A \cap atm(P \cup Q)$. Then, $P \equiv_e^A Q$ holds iff $P \equiv_e^{A'} Q$ holds,
for $e \in \{s, u\}$.

Recall our example programs P and P' . In the previous section, we have seen that 
these programs are neither uniformly equivalent nor strongly equivalent to each other. 
With the concepts of RSE and RUE at hand, we are able to draw a more fine-grained 
picture. In particular, we can choose any A with $c \notin A$, and get $P \equiv_s^A P'$ and $P \equiv_u^A P'$.
In the next section, we will present a model-theoretic characterization to verify this 
claim. 

For technical reasons, we also introduce the following concepts: 

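Definition 5. Let P and Q be programs, and let A be a set of atoms. Then,

(i) $P \models_s^A Q$, iff $\mathcal{SM}(P \cup R) \subseteq \mathcal{SM}(Q \cup R)$, for all programs R over A;

(ii) $P \models_u^A Q$, iff $\mathcal{SM}(P \cup F) \subseteq \mathcal{SM}(Q \cup F)$, for all $F \subseteq A$.

Hence, $P \equiv_e^A Q$ holds iff $P \models_e^A Q$ and $Q \models_e^A P$ jointly hold, for $e \in \{s, u\}$.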

Our first main result lists some properties for relativized strong equivalence. Among 
them, we show that RSE shares an important property with general strong equivalence: In 
particular, from the proofs of the results in [15,29], it appears that for strong equivalence, 
only the addition of unary rules is crucial. That is, constraining the rules in the set R
in Definition 1 to unary ones does not lead to a different concept.

Lemma 1. For programs P, Q, and a set of atoms A, the following propositions are 
equivalent: 

(1) $P \not\models_s^A Q$;

(2) there exists a unary program U over A, such that $\mathcal{SM}(P \cup U) \not\subseteq \mathcal{SM}(Q \cup U)$;

(3) there exists an interpretation Y, such that (a) $Y \models P$; (b) for each $Y' \subset Y$ with
$(Y' \cap A) = (Y \cap A)$, $Y' \not\models P^Y$ holds; and (c) $Y \models Q$ implies existence of an
$X \subset Y$, such that $X \models Q^Y$ and, for each $X' \subset Y$ with $(X' \cap A) = (X \cap A)$,
$X' \not\models P^Y$ holds.

Proof. (1) implies (3): Suppose an interpretation Y and a set R of rules over A, such that
$Y \in \mathcal{SM}(P \cup R)$ and $Y \notin \mathcal{SM}(Q \cup R)$. From $Y \in \mathcal{SM}(P \cup R)$, we get $Y \models P \cup R$
and, for each $Z \subset Y$, $Z \not\models P^Y \cup R^Y$. Thus (a) holds, and since $Y' \models R^Y$ holds, for






each $Y'$ with $(Y' \cap A) = (Y \cap A)$, (b) holds as well. From $Y \notin \mathcal{SM}(Q \cup R)$, we get that
either $Y \not\models Q \cup R$ or there exists an interpretation $X \subset Y$, such that $X \models Q^Y \cup R^Y$.
Note that $Y \not\models Q \cup R$ implies $Y \not\models Q$, since from above, we have $Y \models R$. Thus, in the
case of $Y \not\models Q \cup R$, (c) holds; otherwise we get that $X \models Q^Y$. Now since $X \models R^Y$,
we know that, for each $X' \subset Y$ with $(X' \cap A) = (X \cap A)$, $X' \not\models P^Y$ has to hold,
otherwise $Y \notin \mathcal{SM}(P \cup R)$. Hence, (c) is satisfied.

(3) implies (2): Suppose an interpretation Y, such that Conditions (a-c) hold. We have
two cases: First, if $Y \not\models Q$, consider the unary program $U = (Y \cap A)$. By Conditions
(a) and (b), it is easily seen that $Y \in \mathcal{SM}(P \cup U)$, and from $Y \not\models Q$, $Y \notin \mathcal{SM}(Q \cup U)$
follows. So suppose $Y \models Q$. By (c), there exists an $X \subset Y$, such that $X \models Q^Y$.
Consider the program $U = (X \cap A) \cup \{p \leftarrow q \mid p, q \in (Y \setminus X) \cap A\}$. Again, U is
unary over A. Clearly, $Y \models Q \cup U$ and $X \models Q^Y \cup U$. Thus $Y \notin \mathcal{SM}(Q \cup U)$. It
remains to show that $Y \in \mathcal{SM}(P \cup U)$. We have $Y \models P \cup U$. Towards a contradiction,
suppose a $Z \subset Y$, such that $Z \models P^Y \cup U$. By definition of U, $Z \supseteq (X \cap A)$. If
$(Z \cap A) = (X \cap A)$, Condition (c) is violated; if $(Z \cap A) = (Y \cap A)$, Condition (b) is
violated. Thus, $(X \cap A) \subset (Z \cap A) \subset (Y \cap A)$. But then, $Z \not\models U$, since there exists at
least one rule $p \leftarrow q$ in U, such that $q \in Z$ and $p \notin Z$. Contradiction.

(2) implies (1) holds by definition. □ 



Corollary 1. For programs P, Q, and a set of atoms A, $P \equiv_s^A Q$ holds iff, for each
unary program U over A, $P \cup U \equiv Q \cup U$ holds.
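
Corollary 1 makes a naive test for relativized strong equivalence finite: only the finitely many unary programs over A need to be added. The following Python sketch illustrates this, together with the analogous test for relativized uniform equivalence from Definition 4. The rule encoding and all function names are assumptions of this example, `P1` stands for P', and the enumeration is exponential in the size of A.

```python
from itertools import combinations

def rule(head=(), pos=(), neg=()):
    """head <- pos, not neg, as a triple of frozensets (assumed encoding)."""
    return (frozenset(head), frozenset(pos), frozenset(neg))

def powerset(items):
    items = list(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def satisfies(interp, program):
    return all(bool(interp & h) or not (p <= interp and not (interp & n))
               for h, p, n in program)

def reduct(program, interp):
    return [(h, p, frozenset()) for h, p, n in program if not (interp & n)]

def stable_models(program):
    atoms = frozenset().union(*(h | p | n for h, p, n in program))
    found = set()
    for i in powerset(atoms):
        red = reduct(program, i)
        if satisfies(i, program) and not any(
                satisfies(j, red) for j in powerset(i) if j < i):
            found.add(i)
    return found

def rel_uniform_equiv(p, q, a):
    """Definition 4(ii): compare P and Q after adding every set of facts F over A."""
    return all(stable_models(p + [rule([x]) for x in f]) ==
               stable_models(q + [rule([x]) for x in f])
               for f in powerset(a))

def rel_strong_equiv(p, q, a):
    """Corollary 1: it suffices to add every unary program U over A."""
    a = sorted(a)
    unary = [rule([x]) for x in a] + \
            [rule([x], [y]) for x in a for y in a if x != y]
    return all(stable_models(p + list(u)) == stable_models(q + list(u))
               for u in powerset(unary))

# The example programs P and P' from Section 2 (single-letter atoms).
P = [rule("ab"), rule("a", "b"), rule("b", "a"), rule((), (), "c")]
P1 = [rule("a", (), "b"), rule("b", (), "a"),
      rule("a", "b"), rule("b", "a"), rule((), (), "c")]

print(rel_strong_equiv(P, P1, "ab"), rel_strong_equiv(P, P1, "c"))
```

Tautological rules of the form a ← a are omitted from `unary`, since adding them cannot change the stable models.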

4 A Characterization for Relativized Strong Equivalence 

In this section, we provide a semantical characterization of RSE by generalizing the
notion of SE-models. Hence, our aim is to capture the problem $P \equiv_s^A Q$ in pure
model-theoretic terms. Moreover, having found a suitable notion of relativized SE-models, we
expect that a corresponding counterpart for RUE can be derived in the same manner as
general UE-models are defined over general SE-models.

We introduce the following notion. 

Definition 6. Let A be a set of atoms. A pair of interpretations (X, Y) is a (relativized)
A-SE-interpretation iff either $X = Y$ or $X \subset (Y \cap A)$. Moreover, (X, Y) is a
(relativized) A-SE-model of a program P iff

(i) $Y \models P$;

(ii) for all $Y' \subset Y$ with $(Y' \cap A) = (Y \cap A)$, $Y' \not\models P^Y$; and

(iii) $X \subset Y$ implies existence of an $X' \subseteq Y$ with $(X' \cap A) = X$, such that $X' \models P^Y$
holds.

Compared to SE-models, this definition is more involved. This is due to the fact
that we have to take care of two different effects when relativizing strong equivalence.
The first one is as follows: Suppose a program P has among its SE-models the pairs
(Y, Y) and (Y', Y) with $(Y' \cap A) = (Y \cap A)$ and $Y' \subset Y$. Regardless of the rules
R over A we add to P, $Y' \not\models (P \cup R)^Y$ always implies $Y \not\models P \cup R$. In other words,
Y is not a stable model of $P \cup R$, for any R over A. Hence, in this situation, we do






not pay attention to any original SE-model from P of the form (Z, Y). This motivates
Condition (ii). Condition (iii) deals with a different effect: Suppose P has SE-models
(X, Y) and (X', Y), with $(X \cap A) = (X' \cap A) \subset (Y \cap A)$. Again, it is not possible to
eliminate just one of these two SE-models by adding rules over A. Such SE-models, which
do not differ with respect to A, are collected into a single A-SE-model $((X \cap A), Y)$.

The different role of these two independent conditions becomes even more apparent
in the following cases. On the one hand, setting $A = \emptyset$, the A-SE-models of a program
P collapse with the stable models of P. More precisely, all such ∅-SE-models have to
be of the form (Y, Y), and it holds that (Y, Y) is a ∅-SE-model of a DLP P iff Y is a
stable model of P. This is easily seen by the fact that under $A = \emptyset$, Conditions (i) and (ii)
in Definition 6 exactly coincide with the characterization of stable models. Therefore,
A-SE-model checking for DLPs is not possible in polynomial time in the general case;
otherwise, checking whether a DLP has some stable model would be NP-complete,
which contradicts known results [9], provided the polynomial hierarchy does
not collapse. On the other hand, if each atom from P is contained in A, then the
A-SE-models of P coincide with the SE-models (over A) of P. The conditions in Definition 6
are hereby instantiated as follows: A pair (X, Y) is an A-SE-interpretation iff $X \subseteq Y$,
and by (i) we get $Y \models P$, (ii) is trivially satisfied, and (iii) states $X \models P^Y$.

Next, we list some immediate observations. 

Lemma 2. Let P be a program and A be a set of atoms. We have the following relations
between A-SE-models and SE-models.

- If (Y, Y) is an A-SE-model of P, then (Y, Y) is an SE-model of P.

- If (X, Y) is an A-SE-model of P, then (X', Y) is an SE-model of P, for some X'
with $(X' \cap A) = (X \cap A)$.

Let us compute the relativized SE-models of our example programs P and P' already
used in previous sections. Indeed, we expect that $P \equiv_s^A P'$ holds exactly if the
A-SE-models of P and P' coincide. By the above lemma, it suffices to consider the SE-models of
a program and check whether they result in corresponding A-SE-models. P has two
SE-models, (abc, abc) and (ab, abc). Hence, Condition (ii) in Definition 6 is satisfied
only if $c \in A$. In this case, P possesses two A-SE-models (abc, abc) and (X, abc) where
$X = A \cap \{a, b\}$. In each other case, P has no A-SE-model, since there is a $Y' \subset Y$
with $(Y' \cap A) = (Y \cap A)$, such that $Y' \models P^Y$, for Y = {a, b, c}. This is similar for P'.
In particular, P' has SE-models (abc, abc), (ab, abc), (c, abc), (∅, abc). Thus, whenever
$c \in A$, (c, abc) remains an A-SE-model of P', yielding different A-SE-models for
P and P'. Otherwise, i.e., $c \notin A$, P' has no A-SE-models. Therefore, the A-SE-models
of P and P' coincide whenever $c \notin A$.

For a further example, consider the programs 

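$Q = \{a \vee b \leftarrow;\ a \leftarrow c;\ b \leftarrow c;\ \leftarrow \mathit{not}\, c;\ c \leftarrow a, b\}$;

$Q' = \{a \leftarrow \mathit{not}\, b;\ b \leftarrow \mathit{not}\, a;\ a \leftarrow c;\ b \leftarrow c;\ \leftarrow \mathit{not}\, c;\ c \leftarrow a, b\}$.

Thus, Q' results from Q by replacing the disjunctive rule $a \vee b \leftarrow$ by the two rules
$a \leftarrow \mathit{not}\, b$; $b \leftarrow \mathit{not}\, a$.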







Table 1. Comparing the A-SE-models for example programs Q and Q'.

A           A-SE-models of Q                  A-SE-models of Q'
{a, b, c}   (abc, abc), (a, abc), (b, abc)    (abc, abc), (a, abc), (b, abc), (∅, abc)
{a, b}      (abc, abc), (a, abc), (b, abc)    (abc, abc), (a, abc), (b, abc), (∅, abc)
{a, c}      (abc, abc), (a, abc), (∅, abc)    (abc, abc), (a, abc), (∅, abc)
{b, c}      (abc, abc), (∅, abc), (b, abc)    (abc, abc), (b, abc), (∅, abc)
{a}         -                                 -
{b}         -                                 -
{c}         (abc, abc), (∅, abc)              (abc, abc), (∅, abc)
∅           -                                 -

Table 1 lists, for each $A \subseteq \{a, b, c\}$, the A-SE-models of Q and Q', respectively.
The first row of the table gives the SE-models (over {a, b, c}) for Q and Q'. Observe
that we have $Q \not\equiv_s Q'$. The second row shows that, for A = {a, b}, $Q \not\equiv_s^A Q'$, as well.
Indeed, adding $R = \{a \leftarrow b;\ b \leftarrow a\}$ yields {a, b, c} as stable model of $Q \cup R$, whereas
$Q' \cup R$ has no stable model. For all other $A \subseteq \{a, b, c\}$, the A-SE-models of Q and Q'
coincide. Basically, there are two different reasons. First, for A = {a, c}, A = {b, c},
or A = {c}, Condition (iii) from Definition 6 comes into play. In those cases, at least
one of the SE-interpretations (a, abc) or (b, abc) is “switched” to (∅, abc), and thus the
original difference between the SE-models disappears when considering A-SE-models.
In the remaining cases, i.e., $A \subset \{a, b\}$, Condition (ii) prevents any (X, abc) from being an
A-SE-model. Then, neither Q nor Q' possesses any A-SE-model.
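
The entries of Table 1 can be recomputed by a direct implementation of Definition 6. The Python sketch below is a naive enumeration under assumed conventions (rules as triples of frozensets, `Q1` for Q', hypothetical function names); it is intended as a cross-check of the table rather than as a practical procedure.

```python
from itertools import combinations

def rule(head=(), pos=(), neg=()):
    """head <- pos, not neg. Strings are split character-wise, so this
    shorthand only works for single-letter atoms."""
    return (frozenset(head), frozenset(pos), frozenset(neg))

def powerset(items):
    items = list(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def satisfies(interp, program):
    return all(bool(interp & h) or not (p <= interp and not (interp & n))
               for h, p, n in program)

def reduct(program, interp):
    return [(h, p, frozenset()) for h, p, n in program if not (interp & n)]

def a_se_models(program, atoms, a):
    """Relativized A-SE-models of a program, following Definition 6."""
    a = frozenset(a)
    models = set()
    for y in powerset(atoms):
        red = reduct(program, y)
        if not satisfies(y, program):                      # Condition (i)
            continue
        if any(satisfies(y2, red) for y2 in powerset(y)    # Condition (ii)
               if y2 < y and (y2 & a) == (y & a)):
            continue
        models.add((y, y))
        for x in powerset(y & a):                          # Condition (iii)
            if x < (y & a) and any(satisfies(x2, red)
                                   for x2 in powerset(y) if (x2 & a) == x):
                models.add((x, y))
    return models

# The example programs Q and Q' from the text.
Q = [rule("ab"), rule("a", "c"), rule("b", "c"),
     rule((), (), "c"), rule("c", "ab")]
Q1 = [rule("a", (), "b"), rule("b", (), "a"), rule("a", "c"),
      rule("b", "c"), rule((), (), "c"), rule("c", "ab")]

for a in ("abc", "ab", "ac", "bc", "a", "b", "c", ""):
    same = a_se_models(Q, "abc", a) == a_se_models(Q1, "abc", a)
    print("A = {" + ", ".join(a) + "}:", "coincide" if same else "differ")
```

Under these assumptions, the loop reports a difference exactly for the first two rows of Table 1.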

The general result is as follows. In particular, we show that A-SE-models capture
the notion of $\equiv_s^A$ in the same manner as SE-models capture $\equiv_s$.

Theorem 1. For programs P, Q, and a set of atoms A, $P \equiv_s^A Q$ holds iff P and Q
possess the same A-SE-models.

Proof. First suppose $P \not\equiv_s^A Q$ and wlog consider $P \not\models_s^A Q$. By Lemma 1, there exists
an interpretation Y, such that (a) $Y \models P$; (b) for each $Y' \subset Y$ with $(Y' \cap A) = (Y \cap A)$,
$Y' \not\models P^Y$; and (c) $Y \not\models Q$ or there exists an interpretation $X \subset Y$, such that $X \models Q^Y$
and, for each $X' \subset Y$ with $(X' \cap A) = (X \cap A)$, $X' \not\models P^Y$. First suppose $Y \not\models Q$,
or $Y \models Q$ and $(X \cap A) = (Y \cap A)$. Then (Y, Y) is an A-SE-model of P but not of Q.
Otherwise, i.e., $Y \models Q$ and $(X \cap A) \subset (Y \cap A)$, $((X \cap A), Y)$ is an A-SE-model of
Q. But, by Condition (c), $((X \cap A), Y)$ is not an A-SE-model of P.

For the converse direction of the theorem, suppose a pair (Z, Y), such that wlog
(Z, Y) is an A-SE-model of P but not of Q. First, let $Z = Y$. We show $P \not\models_s^A Q$.
Since (Y, Y) is an A-SE-model of P, we get from Definition 6 that $Y \models P$ and, for
each $Y' \subset Y$ with $(Y \cap A) = (Y' \cap A)$, $Y' \not\models P^Y$. Thus, Conditions (a) and (b)
in Part (3) of Lemma 1 are satisfied for P by Y. On the other hand, (Y, Y) is not an
A-SE-model of Q. By Definition 6, either $Y \not\models Q$ or there exists a $Y' \subset Y$, with
$(Y' \cap A) = (Y \cap A)$, such that $Y' \models Q^Y$. Therefore, Condition (c) from Lemma 1 is
satisfied by either $Y \not\models Q$ or, if $Y \models Q$, by setting $X = Y'$. We apply Lemma 1 and get
$P \not\models_s^A Q$. Consequently, $P \not\equiv_s^A Q$. So suppose $Z \neq Y$. We show that then $Q \not\models_s^A P$
holds. First, observe that whenever (Z, Y) is an A-SE-model of P, then also (Y, Y)
is an A-SE-model of P. Hence, the case where (Y, Y) is not an A-SE-model of Q is






already shown. So, suppose (Y, Y) is an A-SE-model of Q. We have $Y \models Q$ and, for
each $Y' \subset Y$ with $(Y' \cap A) = (Y \cap A)$, $Y' \not\models Q^Y$. This satisfies Conditions (a) and (b)
in Lemma 1 for Q. However, since (Z, Y) is not an A-SE-model of Q, for each $X' \subseteq Y$
with $(X' \cap A) = Z$, $X' \not\models Q^Y$ holds. Since (Z, Y) in turn is an A-SE-model of P, there
exists an $X'' \subseteq Y$ with $(X'' \cap A) = Z$, such that $X'' \models P^Y$. These observations imply
that (c) holds in Lemma 1. We apply the lemma and get $Q \not\models_s^A P$. Hence, $P \not\equiv_s^A Q$. □

5 Characterizing Relativized Uniform Equivalence 

In this section, we present a characterization for deciding $\equiv_u^A$, i.e., uniform equivalence
relative to a set of atoms A. As mentioned before, we aim at defining relativized
A-UE-models over A-SE-models in the same manner as general UE-models are defined over
general SE-models, following Definition 3. We thus define the following concept.

Definition 7. Let A be a set of atoms and P be a program. A pair (X, Y) is a (relativized)
A-UE-model of P iff it is an A-SE-model of P and, for every A-SE-model (X', Y) of
P, $X \subset X'$ implies $X' = Y$.

We suitably adapt parts of Lemma 1 . 

Lemma 3. For programs P and Q, and a set of atoms A, $P \not\models_u^A Q$ holds iff there exists
an interpretation Y, such that (a) $Y \models P$; (b) for each $Y' \subset Y$ with $(Y' \cap A) = (Y \cap A)$,
$Y' \not\models P^Y$ holds; and (c) $Y \models Q$ implies existence of an $X \subset Y$, such that $X \models Q^Y$
and, for each $X'$ with $(X \cap A) \subseteq X' \subset Y$, $X' \not\models P^Y$ holds.

Proof. For the only-if direction, suppose sets of atoms Y and $F \subseteq A$, such that
$Y \in \mathcal{SM}(P \cup F)$ and $Y \notin \mathcal{SM}(Q \cup F)$. We get $Y \models P \cup F$ and, for each $Z \subset Y$,
$Z \not\models P^Y \cup F$. Conditions (a) and (b) thus hold. Since $Y \notin \mathcal{SM}(Q \cup F)$, either
$Y \not\models Q \cup F$ or there exists an interpretation X with $F \subseteq X \subset Y$, such that $X \models Q^Y$.
$Y \not\models Q \cup F$ implies $Y \not\models Q$, since $Y \models P \cup F$; otherwise, i.e., $X \models Q^Y$, we know
from $Z \not\models P^Y \cup F$ that, for each $X'$ with $(X \cap A) \subseteq X' \subset Y$, $X' \not\models P^Y$ has to hold;
otherwise $Y \notin \mathcal{SM}(P \cup F)$. Hence, (c) is satisfied.

For the if-direction, suppose a set Y, such that Conditions (a-c) hold. We have two
cases: First, if $Y \not\models Q$, we set $F = (Y \cap A)$. By (a) and (b), we derive $Y \in \mathcal{SM}(P \cup F)$.
Since $Y \not\models Q$, $Y \notin \mathcal{SM}(Q \cup F)$ follows, and we are done. So suppose $Y \models Q$ and
existence of an $X \subset Y$, such that $X \models Q^Y$. We set $F = (X \cap A)$. Clearly, $Y \models Q \cup F$
and $X \models Q^Y \cup F$. Thus $Y \notin \mathcal{SM}(Q \cup F)$. It remains to show that $Y \in \mathcal{SM}(P \cup F)$. We
have $Y \models P \cup F$. By Condition (c), each $X'$ with $(X \cap A) \subseteq X' \subset Y$ satisfies $X' \not\models P^Y$.
For each other $X' \subset Y$, we have $X' \not\supseteq (X \cap A)$, and thus $X' \not\models F$ by definition of F.
Hence, $X' \not\models P^Y \cup F$ holds for each $X' \subset Y$, yielding $Y \in \mathcal{SM}(P \cup F)$. □

Next, we can derive the desired characterization for relativized uniform equivalence. 

Theorem 2. For programs P, Q, and a set of atoms A, $P \equiv_u^A Q$ holds iff P and Q
possess the same A-UE-models.







Proof. First suppose $P \not\equiv_u^A Q$ and wlog consider $P \not\models_u^A Q$. By Lemma 3, there exists
an interpretation Y, such that (a) $Y \models P$; (b) for each $Y' \subset Y$ with $(Y' \cap A) = (Y \cap A)$,
$Y' \not\models P^Y$; and (c) $Y \not\models Q$ or there exists an interpretation $X \subset Y$, such that $X \models Q^Y$
and, for each $X'$ with $(X \cap A) \subseteq X' \subset Y$, $X' \not\models P^Y$. If $Y \not\models Q$, or $Y \models Q$ and
$(X \cap A) = (Y \cap A)$, then (Y, Y) is an A-SE-model of P but not of Q. Consequently,
also the A-UE-models of P and Q differ. Otherwise, i.e., $Y \models Q$ and $(X \cap A) \subset (Y \cap A)$,
$((X \cap A), Y)$ is an A-SE-model of Q. But Condition (c) guarantees that any
A-SE-model (X', Y) of P with $(X \cap A) \subseteq X'$ satisfies $X' = Y$. Thus, we get an A-UE-model
(X', Y) of Q with $(X \cap A) \subseteq X' \subset Y$, which cannot be an A-UE-model of P.

For the converse direction, suppose wlog a pair (Z, Y) is an A-UE-model of P but
not of Q. The case of $Z = Y$ proceeds similarly as in the proof of Theorem 1, since
(Y, Y) is an A-UE-model of a program iff (Y, Y) is an A-SE-model of it. So suppose
$Z \neq Y$. Since (Z, Y) is an A-UE-model of P, (Y, Y) is an A-UE-model of P, as well.
We further assume that (Y, Y) is an A-UE-model of Q; the other case is already shown.
We thus have $Y \models P \cup Q$ and, for each $Y' \subset Y$ with $(Y' \cap A) = (Y \cap A)$, $Y' \not\models P^Y$
and $Y' \not\models Q^Y$; i.e., Conditions (a) and (b) from Lemma 3 hold in both cases $P \not\models_u^A Q$
and $Q \not\models_u^A P$. There are two possible reasons for (Z, Y) not being an A-UE-model of
Q: (Z, Y) is an A-SE-model of Q, but there exists a Z' with $Z \subset Z' \subset Y$, such that
(Z', Y) is an A-SE-model of Q, as well; or the A-SE-interpretation (Z, Y) is not an
A-SE-model of Q. First, suppose there exists a Z' with $Z \subset Z' \subset Y$, such that (Z', Y)
is an A-SE-model of Q. Then, (c) holds since there exists an X with $(X \cap A) = Z'$, such
that $X \models Q^Y$, and, for each X' with $Z' \subseteq X' \subset Y$, $X' \not\models P^Y$ holds by assumption
that (Z, Y) is an A-UE-model of P. By Lemma 3, we get $P \not\models_u^A Q$. Second, suppose
no Z' with $Z \subset Z' \subset Y$ yields an A-SE-model (Z', Y) of Q and (Z, Y) is not an
A-SE-model of Q. Then, (c) holds for Q since there exists an X with $(X \cap A) = Z$,
such that $X \models P^Y$, and no X' with $Z \subseteq X' \subset Y$ satisfies $X' \models Q^Y$. By Lemma 3, we
get $Q \not\models_u^A P$. We thus have either $P \not\models_u^A Q$ or $Q \not\models_u^A P$. Both cases imply $P \not\equiv_u^A Q$. □

Recall our example programs Q and Q' from above. Via the first row in the table (i.e.,
for A = {a, b, c}, yielding the respective SE-models), it is easily checked by Proposition 2
that Q and Q' are uniformly equivalent. In fact, the SE-model (∅, abc) of Q' is not a
UE-model of Q', due to the presence of the SE-model (a, abc), or alternatively because
of (b, abc). Note that $Q \equiv_u Q'$ implies $Q \equiv_u^A Q'$ for any A. Inspecting the remaining
lines in the table, it can be checked that, for any A, the sets of A-UE-models of Q and
Q' are equal, as expected.
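
This check can also be carried out mechanically. The following Python sketch derives A-UE-models from A-SE-models as in Definition 7 and compares them for Q and Q' over all A. The rule encoding, the name `Q1` for Q', and the function names are assumptions of this example; the enumeration is naive and exponential.

```python
from itertools import combinations

def rule(head=(), pos=(), neg=()):
    """head <- pos, not neg. Strings are split character-wise, so this
    shorthand only works for single-letter atoms."""
    return (frozenset(head), frozenset(pos), frozenset(neg))

def powerset(items):
    items = list(items)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def satisfies(interp, program):
    return all(bool(interp & h) or not (p <= interp and not (interp & n))
               for h, p, n in program)

def reduct(program, interp):
    return [(h, p, frozenset()) for h, p, n in program if not (interp & n)]

def a_se_models(program, atoms, a):
    """Relativized A-SE-models, following Definition 6."""
    a = frozenset(a)
    models = set()
    for y in powerset(atoms):
        red = reduct(program, y)
        if not satisfies(y, program):
            continue
        if any(satisfies(y2, red) for y2 in powerset(y)
               if y2 < y and (y2 & a) == (y & a)):
            continue
        models.add((y, y))
        for x in powerset(y & a):
            if x < (y & a) and any(satisfies(x2, red)
                                   for x2 in powerset(y) if (x2 & a) == x):
                models.add((x, y))
    return models

def a_ue_models(program, atoms, a):
    """A-SE-models (X, Y) that are maximal below Y (Definition 7)."""
    se = a_se_models(program, atoms, a)
    return {(x, y) for x, y in se
            if all(x2 == y for x2, y2 in se if y2 == y and x < x2)}

Q = [rule("ab"), rule("a", "c"), rule("b", "c"),
     rule((), (), "c"), rule("c", "ab")]
Q1 = [rule("a", (), "b"), rule("b", (), "a"), rule("a", "c"),
      rule("b", "c"), rule((), (), "c"), rule("c", "ab")]

print(all(a_ue_models(Q, "abc", a) == a_ue_models(Q1, "abc", a)
          for a in powerset("abc")))
```

The last assertion in the accompanying checks uses a hypothetical two-rule program {a ← not b; b ← not a} to illustrate the correspondence between ∅-UE-models and stable models.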

As a final remark, we mention that A-UE-models enjoy the same property as
A-SE-models for characterizing stable models. In particular, we have a one-to-one
correspondence between the ∅-UE-models of a program P and the stable models of P.



6 Complexity and Implementation Issues 

In this section, we first present a method to decide RSE and RUE via reductions to 
ordinary equivalence tests. Afterwards we utilize these reductions in order to examine the 
computational complexity of RSE and RUE. We remark that the forthcoming reductions 
also hold for the general forms of uniform and strong equivalence. These reductions have 







so far not been presented in the literature we are aware of.³ We note that the presented
method is relevant in practice by composing our reductions with implementations to 
check equivalence between programs [14,21]. 

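Theorem 3. Let $P_1$ and $P_2$ be programs and A be a set of atoms. Moreover, let $p_a$ and
$\bar{p}_a$ be new atoms for each $a \in A$, and define

$P'_i = \{p_a \leftarrow \mathit{not}\, \bar{p}_a;\ \bar{p}_a \leftarrow \mathit{not}\, p_a \mid a \in A\} \cup
\{a \leftarrow p_a \mid a \in A\} \cup P_i$,

for $i = 1, 2$. Then, $P_1 \equiv_u^A P_2$ iff $P'_1 \equiv P'_2$.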

Intuitively, the first two rules guess any truth assignment to the atoms $p_a$. Then,
whenever $p_a$ is contained in the guess, a is derived. Hence, the added rules “simulate”
each possible extension of facts to the $P_i$'s simultaneously. A formal proof is easily
obtained by using the well-known splitting theorem [10,16]. A similar procedure can be
given for RSE making use of the restriction to unary rules, following Corollary 1.

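Theorem 4. Let $P_1$ and $P_2$ be programs and A be a set of atoms. Moreover, let $p_{a,b}$ and
$\bar{p}_{a,b}$ be new atoms for any $a, b \in A$, and define

$P'_i = \{p_{a,b} \leftarrow \mathit{not}\, \bar{p}_{a,b};\ \bar{p}_{a,b} \leftarrow \mathit{not}\, p_{a,b} \mid a, b \in A\} \cup
\{a \leftarrow p_{a,a} \mid a \in A\} \cup \{a \leftarrow b, p_{a,b} \mid a, b \in A, a \neq b\} \cup P_i$,

for $i = 1, 2$. Then, $P_1 \equiv_s^A P_2$ iff $P'_1 \equiv P'_2$.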

Here, we guess any set of unary rules over $A$ via truth assignments to atoms $p_{a,b}$. 
Note that $p_{a,a}$ refers to the fact $a \leftarrow$ rather than to the (tautological) rule $a \leftarrow a$. 
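
To make the size of the program in Theorem 4 concrete, the construction can be written down directly. This is a sketch of the construction only, with no solving; the rule encoding and the atom names `g_a_b` / `ng_a_b` (standing in for $p_{a,b}$ and $\bar{p}_{a,b}$) are hypothetical and assumed not to clash with atoms of the input program.

```python
def rule(hp=(), hn=(), bp=(), bn=()):
    """Hypothetical rule encoding: (head+, head-, body+, body-)."""
    return (frozenset(hp), frozenset(hn), frozenset(bp), frozenset(bn))

def unary_rule_extension(program, a_set):
    """Build P' as in Theorem 4: guess a set of unary rules over A."""
    extra = []
    names = sorted(a_set)
    for a in names:
        for b in names:
            g, ng = f"g_{a}_{b}", f"ng_{a}_{b}"
            extra.append(rule(hp=[g], bn=[ng]))         # guess p_{a,b} ...
            extra.append(rule(hp=[ng], bn=[g]))         # ... or its complement
            if a == b:
                extra.append(rule(hp=[a], bp=[g]))      # p_{a,a} encodes the fact a <-
            else:
                extra.append(rule(hp=[a], bp=[b, g]))   # p_{a,b} encodes a <- b
    return list(program) + extra

p = [rule(hp=["p"], bn=["q"])]                  # p <- not q
extended = unary_rule_extension(p, {"p", "q"})
print(len(extended))                            # 3 * |A|^2 + |P| rules
```

Each pair $(a, b)$ contributes three rules, so the extension adds $3|A|^2$ rules, which is in line with the polynomial-time constructibility used for the membership results.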

The program P' as defined in Theorem 3 is clearly constructible in polynomial time 
from P. Moreover, P' is normal whenever P is normal. This observation gives the re- 
spective membership results for RUE in the forthcoming theorem. Hardness follows, for 
instance, from the complexity results for ordinary equivalence, following Proposition 3. 
Similarly, this argumentation holds for relativized strong equivalence. 

Theorem 5. Given two programs $P$ and $Q$, a set of atoms $A$, and $e \in \{s, u\}$, deciding 
whether $P \equiv_e^A Q$ is $\Pi_2^P$-complete. If $P$ and $Q$ are normal, then deciding $P \equiv_e^A Q$ is 
coNP-complete. 

However, it is worthwhile to pay additional attention to the case of strong equivalence 
relative to $A$ between DLPs. Compared to the other cases, the complexity differs between 
the two ends of the range for $A$. As mentioned earlier, for $A = \emptyset$, we have $P \equiv_s^A Q$ 
iff $P \equiv Q$. The latter test is $\Pi_2^P$-complete, hence $\Pi_2^P$-completeness for $P \equiv_s^A Q$ is 
derived. On the other end, for $A = atm(P \cup Q)$, we have $P \equiv_s^A Q$ iff $P \equiv_s Q$. The 
latter test is coNP-complete, and thus is less involved. We identify the following frontier 
between coNP- and $\Pi_2^P$-hardness for RSE. 

$^3$ Alternative methods, which reduce the general variants of uniform and strong equivalence to 
the consistency problem of stable logic programming, can be found, e.g., in [8]. 







Theorem 6. For DLPs $P$ and $Q$, and a set of atoms $A$, the test for $P \equiv_s^A Q$ is coNP-complete, 
whenever the cardinality of the set $atm(P \cup Q) \setminus A$ is bounded by a constant. 

Proof. The proof is shown via the complementary problem. It suffices 
to show that the test for $P \not\equiv_s^A Q$ is in NP. We guess an interpretation $Y$ and check 
Conditions (a-c) from Lemma 1. Suppose the cardinality of the set $atm(P \cup Q) \setminus A$ 
is bounded by a constant $k$. Then, we need in the worst case one additional guess for 
$X$ and $2^{k+1} + 3$ entailment tests to check the conditions, by unfolding the universal 
quantifications in Conditions (b) and (c). Entailment tests can be done in polynomial 
time. This shows membership in coNP for bounded $k$. Hardness follows from the 
coNP-completeness of strong equivalence. □ 

7 Conclusion 

In this paper, we introduced relativized variants of strong and uniform equivalence 
between logic programs. Moreover, we suitably adapted the important notions of SE- 
models and UE-models in order to provide a uniform semantical characterization. 

We showed that for normal logic programs, relativizing equivalence does not increase 
the computational complexity compared to the original notions of strong (resp. uniform) 
equivalence. This holds in the case of RUE between disjunctive logic programs, as well. 
In the case of $P \equiv_s^A Q$ between DLPs, however, we witnessed an increase of the 
computational complexity, whenever the number of atoms in $atm(P \cup Q) \setminus A$ is unbounded; 
otherwise, $P \equiv_s^A Q$ is coNP-complete. Note that this result is acceptable in a practical 
setting, since it is often sufficient to exclude from $A$ only a small number of (internal) 
atoms occurring in the considered programs in order to check $P \equiv_s^A Q$. 

Our ongoing and future work concerns the application of these newly introduced 
relativized SE- and UE-models for program transformation rules as presented in [2], 
complementing recent research within this area [8,22] . Moreover, we investigate whether 
criteria for disjunction elimination [7] can be suitably generalized to relativized SE- 
models. Finally, we plan to extend our results to further classes of logic programs, 
viz. extended logic programs containing strong negation, nested logic programs, and 
to the function-free first-order (DATALOG) case. Moreover, we plan to explore 
how our results can be applied for optimizations of algorithms used in disjunctive logic 
programming engines such as DLV [5] and smodels+GnT [13]. 



Acknowledgments. The author would like to thank Chiaki Sakama, Thomas Eiter, and 
Hans Tompits, as well as the anonymous referees for their valuable comments, which 
helped improve the paper. 



References 

1. N. Bidoit and C. Froidevaux. General Logical Databases and Programs: Default Logic Se- 
mantics and Stratification. Information and Computation, 91:15-54, 1991. 

2. S. Brass and J. Dix. Semantics of (Disjunctive) Logic Programs Based on Partial Evaluation. 
Journal of Logic Programming, 38(3):167-213, 1999. 







3. P. Cabalar. A Three-Valued Characterization for Strong Equivalence of Logic Programs. In 
Proc. AAAI-02, pg. 106-111. AAAI Press/MIT Press, 2002. 

4. D. de Jongh and L. Hendriks. Characterizations of Strongly Equivalent Logic Programs in 
Intermediate Logics. Theory and Practice of Logic Programming, 3(3):259-270, 2003. 

5. T. Eiter, W. Faber, N. Leone, and G. Pfeifer. Declarative Problem-Solving Using the DLV 
System. In Logic-Based Artificial Intelligence, pg. 79-103. Kluwer Academic, 2000. 

6. T. Eiter and M. Fink. Uniform Equivalence of Logic Programs under the Stable Model 
Semantics. In Proc. ICLP-03, LNCS 2916, pg. 224-238. Springer, 2003. 

7. T. Eiter, M. Fink, H. Tompits, and S. Woltran. On Eliminating Disjunctions in Stable Logic 
Programming. In Proc. KR-04, pg. 447-585. AAAI-Press, 2004. 

8. T. Eiter, M. Fink, H. Tompits, and S. Woltran. Simplifying Logic Programs under Uniform 
and Strong Equivalence. In Proc. LPNMR-04, LNCS 2923, pg. 87-99. Springer, 2004. 

9. T. Eiter and G. Gottlob. On the Computational Cost of Disjunctive Logic Programming: 
Propositional Case. Annals of Math. and Artificial Intelligence, 15(3/4):289-323, 1995. 

10. T. Eiter, G. Gottlob, and H. Mannila. Disjunctive Datalog. ACM Transactions on Database 
Systems, 22:364-418, 1997. 

11. M. Gelfond and V. Lifschitz. Classical Negation in Logic Programs and Disjunctive Databases. 
New Generation Computing, 9:365-385, 1991. 

12. K. Inoue and C. Sakama. Equivalence of Logic Programs under Updates. In Proc. JELIA-04. 
This volume. 

13. T. Janhunen, I. Niemela, P. Simons, and J.-H. You. Partiality and Disjunctions in Stable Model 
Semantics. In Proc. KR-00, pg. 411-419. Morgan Kaufmann, 2000. 

14. T. Janhunen and E. Oikarinen. LPEQ and DLPEQ - Translators for Automated Equivalence 
Testing of Logic Programs. In Proc. LPNMR-04, LNCS 2923, pg. 336-340. Springer, 2004. 

15. V. Lifschitz, D. Pearce, and A. Valverde. Strongly Equivalent Logic Programs. ACM 
Transactions on Computational Logic, 2(4):526-541, 2001. 

16. V. Lifschitz and H. Turner. Splitting a Logic Program. In Proc. ICLP-94, pg. 23-38. 1994. 

17. F. Lin. Reducing Strong Equivalence of Logic Programs to Entailment in Classical Proposi- 
tional Logic. In Proc. KR-02, pg. 170-176. Morgan Kaufmann, 2002. 

18. M. J. Maher. Equivalences of Logic Programs. In Minker [20], pg. 627-658. 

19. W. Marek and M. Truszczyński. Autoepistemic Logic. J. of the ACM, 38(3):588-619, 1991. 

20. J. Minker, editor. Foundations of Deductive Databases and Logic Programming. Morgan 
Kaufmann, 1988. 

21. E. Oikarinen and T. Janhunen. Verifying the Equivalence of Logic Programs in the Disjunctive 
Case. In Proc. LPNMR-04, LNCS 2923, pg. 180-193. Springer, 2004. 

22. M. Osorio, J. A. Navarro, and J. Arrazola. Equivalence in Answer Set Programming. In Proc. 
LOPSTR-01, LNCS 2372, pg. 57-75. Springer, 2001. 

23. D. Pearce, H. Tompits, and S. Woltran. Encodings for Equilibrium Logic and Logic Programs 
with Nested Expressions. In Proc. EPIA-01, LNCS 2258, pg. 306-320. Springer, 2001. 

24. D. Pearce and A. Valverde. Uniform Equivalence for Equilibrium Logic and Logic Programs. 
In Proc. LPNMR-04, LNCS 2923, pg. 194-206. Springer, 2004. 

25. T. Przymusinski. Stable Semantics for Disjunctive Programs. New Generation Computing 
Journal, 9:401-424, 1991. 

26. Y. Sagiv. Optimizing Datalog Programs. In Minker [20], pg. 659-698. 

27. O. Shmueli. Equivalence of Datalog Queries is Undecidable. Journal of Logic Programming, 
15(3):231-242, 1993. 

28. H. Turner. Strong Equivalence for Logic Programs and Default Theories (Made Easy). In 
Proc. LPNMR-01, LNCS 2173, pg. 81-92, Springer, 2001. 

29. H. Turner. Strong Equivalence Made Easy: Nested Expressions and Weight Constraints. 
Theory and Practice of Logic Programming, 3(4-5):602-622, 2003. 




Equivalence of Logic Programs Under Updates 



Katsumi Inoue 1 and Chiaki Sakama 2 

1 National Institute of Informatics 
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan 
ki@nii.ac.jp 

2 Department of Computer and Communication Sciences 
Wakayama University, Sakaedani, Wakayama 640-8510, Japan 

sakama@sys.wakayama-u.ac.jp 



Abstract. This paper defines a general framework for testing equivalence 
of logic programs with respect to two parameters. Given two sets of 
rules $\mathcal{Q}$ and $\mathcal{R}$, two logic programs $P_1$ and $P_2$ are said to be update 
equivalent with respect to $(\mathcal{Q}, \mathcal{R})$ if $(P_1 \setminus Q) \cup R$ and $(P_2 \setminus Q) \cup R$ have the same 
answer sets for any two logic programs $Q \subseteq \mathcal{Q}$ and $R \subseteq \mathcal{R}$. The notion 
of update equivalence is suitable to take program updates into account 
when two logic programs are compared. That is, the notion of relativity 
stipulates the languages of updates, and the two parameters $\mathcal{Q}$ and $\mathcal{R}$ 
correspond to the languages for deletion and addition, respectively. Clearly, 
the notion of strong equivalence is a special case of update equivalence 
where $\mathcal{Q}$ is empty and $\mathcal{R}$ is the set of all rules in the language. In fact, 
the notion of update equivalence is strong enough to capture many other 
notions such as weak equivalence, update equivalence on common rules, 
and uniform equivalence. We also discuss computation and complexity 
of update equivalence. 



1 Introduction 

The notion of equivalence in logic programming has recently become important. 
Because a logic program is used to represent knowledge of a problem domain, we 
often have to consider whether two logic programs $P_1$ and $P_2$ represent the same 
knowledge. For example, one logic program $P_1$ may be viewed as a specification 
of knowledge in some domain, and another representation $P_2$ may be expected 
to be a compact form of $P_1$ which can easily be computed. 

Strong equivalence [11] is one of the most widely recognized criteria for 
equivalence of logic programs. Two logic programs $P_1$ and $P_2$ are said to be strongly 
equivalent if for any logic program $R$, $P_1 \cup R$ and $P_2 \cup R$ have the same 
answer sets. On the other hand, two programs are weakly equivalent if they have 
the same answer sets. The notion of strong equivalence was introduced earlier 
by Maher [15] for definite programs under the name of equivalence as program 
segments. Recently, strong equivalence has been studied both logically and 
computationally for answer set programming [11,18,17,14,21,2,4]. 

In [11], it is argued that strong equivalence can be used to simplify a part 
of a logic program without looking at the other part. For example, $\{p \leftarrow p\}$ 



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 174-186, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004 







and $\emptyset$ are strongly equivalent, so that the rule in the former set can always be 
eliminated from any program. On the other hand, the two weakly equivalent 
programs $\{p \leftarrow \mathit{not}\, q\}$ and $\{p \leftarrow\}$ are not strongly equivalent, so the rule in 
the former cannot be replaced by the rule in the latter. Hence, strong equivalence 
takes the influence of addition of a rule set to each program into account. 

However, strong equivalence cannot capture the negative influence, i.e., deletion 
of a rule set from each program. For example, $P_1 = \{p \leftarrow,\ q \leftarrow \mathit{not}\, p\}$ 
and $P_2 = \{p \leftarrow\}$ are strongly equivalent. According to the above discussion, 
the two rules in $P_1$ can be replaced by the rule of $P_2$, which implies that the 
rule $q \leftarrow \mathit{not}\, p$ can be eliminated from $P_1$. However, the process of program 
development is not always monotonic. We often revise and update our previous 
rules and delete some part of a program in exchange for additional new rules. 
In such a case, eliminating a rule like $q \leftarrow \mathit{not}\, p$ is harmful, and it should be 
kept for later use because $q$ should be derived whenever $p$ is removed. Strong 
equivalence does not take such a possibility of removal into account. 

In this paper, we consider a much stronger notion of equivalence which is 
tolerant of both addition and removal. Given two sets of rules $\mathcal{Q}$ and $\mathcal{R}$, two 
logic programs $P_1$ and $P_2$ are said to be update equivalent with respect to $(\mathcal{Q}, \mathcal{R})$ 
if $(P_1 \setminus Q) \cup R$ and $(P_2 \setminus Q) \cup R$ have the same answer sets for any two logic 
programs $Q \subseteq \mathcal{Q}$ and $R \subseteq \mathcal{R}$. Clearly, the notion of strong equivalence is a 
special case of update equivalence where $\mathcal{Q}$ is empty and $\mathcal{R}$ is the set of all rules 
in the language. In the above example, $P_1$ and $P_2$ are not update equivalent with 
respect to $(\mathcal{P}, \mathcal{P})$ where $\mathcal{P}$ is the set of all rules in the language of $P_1$ and $P_2$. 
This is because the removal of $p \leftarrow$ from both programs involves derivation of 
$q$ in $P_1$. For another example, $P_3 = \{p \leftarrow,\ q \leftarrow p\}$ and $P_4 = \{p \leftarrow,\ q \leftarrow\}$ 
are strongly equivalent, but are not update equivalent with respect to $(\mathcal{P}, \mathcal{P})$ 
because the removal of $p \leftarrow$ differentiates their answer sets. Non-equivalence 
of $P_3$ and $P_4$ is explained as follows. While the reason why $q$ is true in $P_3$ is 
justified by the rule $q \leftarrow p$ and the truth of $p$, both $p$ and $q$ hold as facts in 
$P_4$. So $q$ depends on $p$ in $P_3$, and the loss of $p$ causes the loss of $q$, but no 
such dependency exists in $P_4$. This contrasts well with the case that the weakly 
equivalent programs $\{p \leftarrow \mathit{not}\, q\}$ and $\{p \leftarrow\}$ are not strongly equivalent, 
where the truth of $p$ is factual in the latter but it is justified by the default 
rule in the former. In other words, strong equivalence distinguishes derivation of 
literals through negation as failure from derivation without involving negation 
as failure. This asymmetric property of strong equivalence is not always natural 
from the viewpoint of nonmonotonic reasoning. 

Without any restriction on $\mathcal{Q}$ and $\mathcal{R}$, two logic programs are called strongly 
update equivalent if they are update equivalent with respect to $(\mathcal{P}, \mathcal{P})$. Surprisingly, 
only a small class of strongly equivalent logic programs becomes strongly 
update equivalent. In fact, we prove that two logic programs are strongly update 
equivalent only if they differ only in additional valid (or tautological) rules. However, 
a slightly modified definition assures that most modular transformations 
of rules proposed in the literature preserve update equivalence on common rules. 
As a related work, Eiter et al. [3] discuss another generalization of strong equivalence 
in the context of updating logic programs. However, the effect of removal 
in equivalence of logic programs has never been analyzed so far. Leite [10] introduces 
another update equivalence in the context of dynamic logic programming, 
but it is not a generalization of strong equivalence. 

For update equivalence, we can often restrict the languages of changing parts 
$(\mathcal{Q}, \mathcal{R})$ to some subsets of the whole language of programs. Such a restriction 
is of practical interest because logic programs and deductive databases are 
usually divided into invariable and variable parts such that only variable parts 
are changed in updates [19]. This notion of equivalence is partially considered in 
[14] as relative equivalence and in [2] as uniform equivalence. 

The rest of this paper is organized as follows. Section 2 reviews the answer 
set semantics and previous definitions of equivalence. Section 3 defines rela- 
tive update equivalence. Section 4 shows the necessary and sufficient condition 
of strong update equivalence, and also considers update equivalence on com- 
mon rules. Section 5 discusses the application of relative update equivalence 
to database updates. Section 6 shows a translation from relative update equiv- 
alence into relative strong equivalence, and considers the time complexity of 
update equivalence. Section 7 concludes the paper. 

2 Background 

A (logic) program is represented in a general extended disjunctive program 
(GEDP) [13,7], which consists of a finite number of rules of the form: 

$L_1; \cdots; L_k; \mathit{not}\, L_{k+1}; \cdots; \mathit{not}\, L_l \leftarrow L_{l+1}, \ldots, L_m, \mathit{not}\, L_{m+1}, \ldots, \mathit{not}\, L_n$ (1) 

where each $L_i$ is a literal ($n \geq m \geq l \geq k \geq 0$), and $\mathit{not}$ is negation as failure 
(NAF). The symbol ; represents a disjunction. For any literal $L$, the literal 
complementary to $L$ is written as $\overline{L}$, that is, when $A$ is an atom, $\overline{A} = \neg A$ and 
$\overline{\neg A} = A$. A rule with variables stands for the set of its ground instances. The 
left-hand side of the rule is the head, and the right-hand side is the body. For each rule 
$r$ of the form (1), $head^+(r)$, $head^-(r)$, $body^+(r)$ and $body^-(r)$ denote the sets 
of literals $\{L_1, \ldots, L_k\}$, $\{L_{k+1}, \ldots, L_l\}$, $\{L_{l+1}, \ldots, L_m\}$, and $\{L_{m+1}, \ldots, L_n\}$, 
respectively. A rule $r$ is an integrity constraint if $head^+(r) = \emptyset$. Any rule with 
the empty body $H \leftarrow$ is called a fact and is also written as $H$ without the 
symbol $\leftarrow$ as long as no confusion arises. A GEDP is an extended disjunctive 
program (EDP) [5] if it contains no NAF in the head of any rule (i.e., $k = l$). 

The semantics of a program is given by its answer sets. First, let $P$ be a 
GEDP without NAF (i.e., $k = l$ and $m = n$) and $S \subseteq Lit$, where $Lit$ is the set 
of all ground literals in the language of $P$. Then, $S$ is an answer set of $P$ if $S$ is 
a minimal set satisfying the conditions: 

1. For each ground rule $r$ of the form $L_1; \cdots; L_l \leftarrow L_{l+1}, \ldots, L_m$ from $P$, 
$body^+(r) \subseteq S$ implies $head^+(r) \cap S \neq \emptyset$; 

2. If $S$ contains a pair of complementary literals $L$ and $\overline{L}$, then $S = Lit$. 







Second, given any GEDP $P$ (with NAF) and $S \subseteq Lit$, consider the GEDP 
(without NAF) $P^S$ obtained as follows: a rule $L_1; \cdots; L_k \leftarrow L_{l+1}, \ldots, L_m$ is in 
$P^S$ if there is a ground rule $r$ of the form (1) from $P$ such that $head^-(r) \subseteq S$ 
and $body^-(r) \cap S = \emptyset$. Then, $S$ is an answer set of $P$ if $S$ is an answer set of $P^S$. 

An answer set is consistent if it is not $Lit$. A program is consistent if it has 
a consistent answer set. An answer set $S$ of $P$ is minimal if there is no other 
answer set $S'$ of $P$ such that $S' \subset S$. Every answer set of any EDP is minimal 
[5], but the minimality of answer sets no longer holds for GEDPs [13]. The set 
of all answer sets of $P$ is written as $AS(P)$. 
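
The two-step definition above can be mirrored by a brute-force enumeration for small ground programs. The sketch below is not taken from the paper: it assumes programs without classical negation (so the clause about $Lit$ never applies) and uses a hypothetical 4-tuple encoding of rule (1) as (head+, head-, body+, body-).

```python
from itertools import combinations

def rule(hp=(), hn=(), bp=(), bn=()):
    """Hypothetical encoding of rule (1): (head+, head-, body+, body-)."""
    return (frozenset(hp), frozenset(hn), frozenset(bp), frozenset(bn))

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def atoms(program):
    return set().union(*(part for r in program for part in r))

def reduct(program, s):
    """P^S: keep head+ <- body+ when head- is contained in S and body- misses S."""
    return [(hp, bp) for hp, hn, bp, bn in program if hn <= s and not (bn & s)]

def satisfies(s, positive):
    """Condition 1: body+ contained in S implies head+ intersects S."""
    return all(not bp <= s or hp & s for hp, bp in positive)

def answer_sets(program):
    """All S that are minimal sets satisfying the reduct P^S (atoms only)."""
    found = []
    for s in subsets(atoms(program)):
        red = reduct(program, s)
        if satisfies(s, red) and not any(satisfies(t, red) for t in subsets(s) if t < s):
            found.append(s)
    return found

normal = [rule(hp=["p"], bn=["q"])]     # p <- not q
naf_head = [rule(hp=["p"], hn=["p"])]   # p ; not p <-
print(answer_sets(normal))
print(answer_sets(naf_head))
```

For the second program the enumeration finds both the empty set and $\{p\}$, which illustrates why minimality of answer sets is lost once NAF is allowed in heads.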

The notions of weak and strong equivalence are defined as follows. 

Definition 2.1. Let $P_1$, $P_2$, and $R$ be programs. 

(1) $P_1$ and $P_2$ are (weakly) equivalent if $AS(P_1) = AS(P_2)$. 

(2) $P_1$ and $P_2$ are equivalent relative to $R$ [8] if $AS(P_1 \cup R) = AS(P_2 \cup R)$. 

(3) $P_1$ and $P_2$ are strongly equivalent [11] if $P_1$ and $P_2$ are equivalent relative to 
any program. 

Obviously, two strongly equivalent programs are weakly equivalent, and in 
fact they are equivalent relative to $\emptyset$. 

3 Relative Update Equivalence 

Relative update equivalence of logic programs is an elaboration of strong equiv- 
alence of logic programs under the two additional concepts: (a) deletion of rules 
as well as addition, and (b) the restriction of languages for deletion and addition. 

Definition 3.1. Suppose that $P_1$, $P_2$, $\mathcal{Q}$, and $\mathcal{R}$ are sets of rules with a common 
underlying language. $P_1$ and $P_2$ are update equivalent with respect to $(\mathcal{Q}, \mathcal{R})$ if 
$AS((P_1 \setminus Q) \cup R) = AS((P_2 \setminus Q) \cup R)$$^1$ for any programs $Q \subseteq \mathcal{Q}$ and $R \subseteq \mathcal{R}$. 

Each rule in $\mathcal{Q}$ is called a removable rule, while each rule in $\mathcal{R}$ is called an 
insertable rule. A removable or insertable rule is called an updatable rule. 
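
When the sets of removable and insertable rules are small and finite, Definition 3.1 can be checked by enumerating every pair $(Q, R)$. The following is a minimal sketch under that assumption, restricted to ground programs without classical negation; the rule encoding and all function names are hypothetical.

```python
from itertools import combinations

def rule(hp=(), hn=(), bp=(), bn=()):
    """Hypothetical rule encoding: (head+, head-, body+, body-)."""
    return (frozenset(hp), frozenset(hn), frozenset(bp), frozenset(bn))

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def answer_sets(program):
    """Brute force: S is an answer set iff S is a minimal model of the reduct."""
    program = list(program)
    atoms = set().union(*(part for r in program for part in r))
    found = set()
    for s in subsets(atoms):
        red = [(hp, bp) for hp, hn, bp, bn in program if hn <= s and not (bn & s)]
        sat = lambda t: all(not bp <= t or hp & t for hp, bp in red)
        if sat(s) and not any(sat(t) for t in subsets(s) if t < s):
            found.add(s)
    return found

def update_witness(p1, p2, removable, insertable):
    """Search for a pair (Q, R) violating Definition 3.1; None if there is none."""
    for q in subsets(removable):
        for r in subsets(insertable):
            if answer_sets((set(p1) - q) | r) != answer_sets((set(p2) - q) | r):
                return q, r
    return None

fact_p, fact_q = rule(hp=["p"]), rule(hp=["q"])
p1 = [fact_p, rule(hp=["q"], bn=["p"])]   # {p <-, q <- not p}
p2 = [fact_p]                             # {p <-}

print(update_witness(p1, p2, [], [fact_p, fact_q]))   # insertion of facts only
print(update_witness(p1, p2, [fact_p], []))           # removal of p <- allowed
```

With insertion of facts only, no distinguishing pair is found, whereas allowing the removal of $p \leftarrow$ yields the witness $Q = \{p \leftarrow\}$, $R = \emptyset$, matching the example from the introduction.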

When two programs are update equivalent with respect to some pair $(\mathcal{Q}, \mathcal{R})$, 
they are called relatively update equivalent, or update equivalent for short. As 
its name suggests, (relative) update equivalence enables us to give the semantics 
for equivalence of logic programs with respect to updates. That is, $P_1$ and 
$P_2$ are regarded as equivalent programs in the sense that they are equivalent 
after any program $Q \subseteq \mathcal{Q}$ is deleted and then any program $R \subseteq \mathcal{R}$ is inserted. 
Consideration of such update equivalence is meaningful and important because 
$^1$ For removing rules containing variables from a program, the set difference operation 
is semantically defined on ground programs as $P \setminus Q = ground(P) \setminus ground(Q)$, where 
$ground(P)$ is the set of ground instances of elements from $P$. For example, when $x$ is a 
variable and $a$ is a constant, $\{p(a)\} \setminus \{p(x)\} = \emptyset$, and $\{p(x)\} \setminus \{p(a)\} = \{p(y) \mid y \neq a\}$. 
Similarly, the union $P \cup Q$ and the intersection $P \cap Q$ are respectively defined as 
$ground(P) \cup ground(Q)$ and $ground(P) \cap ground(Q)$, e.g., $\{p(a)\} \cup \{p(x)\} = \{p(x)\}$. 







it guarantees equivalence of two different programs dynamically in the face of 
any common change of these programs.$^2$ 

In the special case, both $\mathcal{Q}$ and $\mathcal{R}$ are given as the set of all rules in the 
language of programs, that is, any rule can be either deleted or added in updates. 
This case is further investigated in Section 4. Often, however, we want to consider 
update equivalence with respect to $(\mathcal{Q}, \mathcal{R})$ in which $\mathcal{Q}$ and $\mathcal{R}$ are given as some 
distinguished sets of updatable rules. Such a restriction on updates is of practical 
interest because logic programs and deductive databases are usually divided 
into invariable and variable parts such that only variable parts are updatable 
[19]. Abductive logic programming [9,6] is another example where $\mathcal{Q}$ and $\mathcal{R}$ are 
defined as distinguished sets of abducibles. Such applications of relative update 
equivalence are investigated in Section 5. 

A similar notion using some distinguished set of insertable rules can also be 
defined for strong equivalence.$^3$ 

Definition 3.2. Suppose that $P_1$ and $P_2$ are programs, and that $\mathcal{R}$ is a set of 
insertable rules. $P_1$ and $P_2$ are strongly equivalent with respect to $\mathcal{R}$ if 
$AS(P_1 \cup R) = AS(P_2 \cup R)$ holds for any program $R \subseteq \mathcal{R}$. 

4 Strong Update Equivalence 

In the general notion of relative update equivalence, we can restrict updatable 
rules to some distinguished rules in the language. In this section, we consider 
the special case of update equivalence where such restriction is not specified. 

Definition 4.1. Two programs $P_1$ and $P_2$ are strongly update equivalent 
(S-update equivalent, for short) if $AS((P_1 \setminus Q) \cup R) = AS((P_2 \setminus Q) \cup R)$ holds for 
any programs $Q$ and $R$. 

For example, two programs $\{p \leftarrow p\}$ and $\emptyset$ are S-update equivalent. Obviously, 
two S-update equivalent programs are strongly equivalent, and in fact they 
are equivalent for $Q = \emptyset$ and for any program $R$. By definition, two S-update 
equivalent programs are update equivalent with respect to any pair $(\mathcal{Q}, \mathcal{R})$ of 
updatable rules. 



4.1 Characterization of S-Update Equivalence 

One important problem is how two strongly update equivalent programs are 
different from each other. With regard to this problem, we can show that two 
S-update equivalent programs are almost identical; the difference between the 

$^2$ Eiter et al. [3] have also discussed a notion of update equivalence. The focuses in 
[3] are different from ours in that the semantics of updates are captured by Kripke 
structures and that Unitary characterization of updates are described. 

$^3$ For strong equivalence, Lin [14] has also mentioned the idea of relative equivalence 
by defining equivalence between two logic programs with respect to a set of atoms. 
Our definition is more general than Lin’s as a set of rules is taken into account. 







two always consists of valid rules. Here, a valid rule is a rule that never changes 
the answer sets of any program if the rule is added to the program. 

Definition 4.2. A rule $r$ is valid if $\{r\}$ is strongly equivalent to $\emptyset$. 

Lemma 4.1. Let $U$ be a program. If $U$ and $\emptyset$ are strongly equivalent, then $U$ is 
a set of valid rules. 

Lemma 4.2. Let $P$ be a program, and $V$ a set of valid rules. Then, $P$ and $P \cup V$ 
are S-update equivalent. 

The symmetric difference $P_1 \Delta P_2$ of two programs $P_1$ and $P_2$ is defined as 
$P_1 \Delta P_2 = (P_1 \setminus P_2) \cup (P_2 \setminus P_1)$. The next theorem shows that S-update equivalence 
of $P_1$ and $P_2$ is determined by the validity of $P_1 \Delta P_2$. 

Theorem 4.3. Two programs $P_1$ and $P_2$ are S-update equivalent if and only if 
$P_1 \Delta P_2$ is a set of valid rules. 

Proof. Suppose that $P_1$ and $P_2$ are S-update equivalent. Then, taking $Q = P_2$, for any program 
$R$, $AS((P_1 \setminus P_2) \cup R) = AS((P_2 \setminus P_2) \cup R)$, that is, $AS((P_1 \setminus P_2) \cup R) = AS(R)$. 
Hence, $P_1 \setminus P_2$ is strongly equivalent to $\emptyset$. By Lemma 4.1, $P_1 \setminus P_2$ is a set of valid 
rules. The same argument can be applied to $P_2 \setminus P_1$. 

Conversely, suppose that $P_1 \setminus P_2$ and $P_2 \setminus P_1$ are sets of valid rules. By 
Lemma 4.2, $P_2$ and $P_2 \cup (P_1 \setminus P_2)$ are S-update equivalent, that is, $P_2$ and 
$P_2 \cup P_1$ are S-update equivalent. Similarly, $P_1$ and $P_1 \cup P_2$ are S-update equivalent. 
Therefore, $P_2$ and $P_1$ are S-update equivalent. □ 

The next theorem completely characterizes valid rules by their syntax. 

Theorem 4.4. A rule $r$ of the form (1) is valid if and only if it satisfies one of 
the following: 

(i) $head^+(r) \cap body^+(r) \neq \emptyset$. 

(ii) $head^-(r) \cap body^-(r) \neq \emptyset$. 

(iii) $body^+(r) \cap body^-(r) \neq \emptyset$. 

(iv) $head^+(r) \cup body^-(r) \neq \emptyset$ and there are two literals $L_1$ and $L_2$ in $head^-(r) \cup 
body^+(r)$ such that $L_1 = \overline{L_2}$. 
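
Since conditions (i)-(iv) are purely syntactic, validity can be checked rule by rule, and Theorem 4.3 then turns S-update equivalence into a test on the symmetric difference. The sketch below assumes a hypothetical encoding in which a rule is a 4-tuple (head+, head-, body+, body-) of literal sets and classical negation is written with a leading `-`; the example rules are illustrative choices rather than the paper's own list.

```python
def rule(hp=(), hn=(), bp=(), bn=()):
    """Hypothetical rule encoding: (head+, head-, body+, body-)."""
    return (frozenset(hp), frozenset(hn), frozenset(bp), frozenset(bn))

def complement(lit):
    """Complementary literal; '-p' is the assumed spelling of classical negation."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def has_complementary_pair(lits):
    return any(complement(l) in lits for l in lits)

def is_valid(r):
    """Syntactic test of conditions (i)-(iv) of Theorem 4.4."""
    hp, hn, bp, bn = r
    return bool(
        hp & bp                                               # (i)
        or hn & bn                                            # (ii)
        or bp & bn                                            # (iii)
        or ((hp | bn) and has_complementary_pair(hn | bp))    # (iv)
    )

def s_update_equivalent(p1, p2):
    """Theorem 4.3: the symmetric difference must consist of valid rules only."""
    return all(is_valid(r) for r in set(p1) ^ set(p2))

print(is_valid(rule(hp=["p"], bp=["p", "q"])))        # p <- p, q           (i)
print(is_valid(rule(hp=["q"], hn=["p"], bn=["p"])))   # q ; not p <- not p  (ii)
print(is_valid(rule(hp=["q"], bp=["p"], bn=["p"])))   # q <- p, not p       (iii)
print(is_valid(rule(bp=["p", "-p"], bn=["q"])))       # <- p, -p, not q     (iv)
print(is_valid(rule(hp=["p"], hn=["p"])))             # p ; not p <-        none
print(s_update_equivalent([rule(hp=["p"], bp=["p"])], []))
```

Because heads and bodies are stored as sets here, syntactic variants that differ only in repeated literals are identified by the encoding; a faithful test of Theorem 4.3 on such variants would need a multiset or list representation instead.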

Proof. Let $R$ be any program, and $S$ any answer set of $R$. If a rule $r$ satisfies one 
of (i), (ii), and (iii), then it is easy to show that $AS(R^S) = AS((R \cup \{r\})^S)$. Since 
$S \in AS(R^S)$, $S$ is also an answer set of $R \cup \{r\}$. Next, suppose that $r$ satisfies 
(iv). Let $L_1$ and $L_2$ be a complementary pair of literals in the condition (iv). 

(I) Firstly consider the case that $S$ is consistent. (I-a) If $L_1, L_2 \in body^+(r)$, 
then $body^+(r) \not\subseteq S$ because $S$ is consistent. So no literal is derived through 
the rule in $\{r\}^S$. (I-b) If $L_1, L_2 \in head^-(r)$, then $head^-(r) \not\subseteq S$ because $S$ is 
consistent. Then $\{r\}^S = \emptyset$. (I-c) If $L_1 \in body^+(r)$ and $L_2 \in head^-(r)$, then 
either (c-1) $L_1 \notin S$ and $L_2 \notin S$, or (c-2) $L_1 \in S$ and $L_2 \notin S$, or (c-3) $L_1 \notin S$ 
and $L_2 \in S$. If (c-1) or (c-2), $head^-(r) \not\subseteq S$ and thus $\{r\}^S = \emptyset$. If (c-3), 
$body^+(r) \not\subseteq S$ and hence no literal is derived through the rule in $\{r\}^S$. In either 
case, $AS(R^S) = AS((R \cup \{r\})^S)$ holds. 







(II) Secondly suppose that $S = Lit$. In this case, $head^-(r) \cup body^+(r) \subseteq S$, 
that is, $head^-(r) \subseteq S$ and $body^+(r) \subseteq S$. On the other hand, $head^+(r) \cup 
body^-(r) \neq \emptyset$ implies that either (II-a) $head^+(r) \neq \emptyset$ or (II-b) $body^-(r) \neq \emptyset$ 
holds. If (II-a), $r$ is satisfied by $S$, that is, $body^+(r) \subseteq S$ implies $head^+(r) \cap S \neq \emptyset$. 
If (II-b), $\{r\}^S = \emptyset$ holds. In either case, $AS(R^S) = AS((R \cup \{r\})^S)$ holds. Thus, 
$S = Lit$ is also an answer set of $R \cup \{r\}$. Hence, $r$ is valid. 

Conversely, suppose that $r$ satisfies none of (i), (ii), (iii), and (iv). Then, 

$head^+(r) \cap body^+(r) = \emptyset$, (2) 

$head^-(r) \cap body^-(r) = \emptyset$, (3) 

$body^+(r) \cap body^-(r) = \emptyset$, (4) 

and either 

$head^+(r) \cup body^-(r) = \emptyset$ (5) 

or there is no complementary pair of literals $L_1$ and $L_2$ in $head^-(r) \cup body^+(r)$. 
Let $S = head^-(r) \cup body^+(r)$. 

(I) Suppose that $S$ is consistent. In this case, there are no complementary 
literals $L_1$ and $L_2$ in $S = head^-(r) \cup body^+(r)$. Then, $body^-(r) \cap S = \emptyset$ by (3) 
and (4). Also by $head^-(r) \subseteq S$, $\{r\}^S \neq \emptyset$ holds. By $body^+(r) \subseteq S$, there is 
a literal $L \in head^+(r)$ such that $L \notin S$ by (2). Now, let $R$ be the program 
exactly consisting of the facts of $S$, i.e., $R = S$. Obviously, $R^S = S$. Then, 
$L \in S'$ for some answer set $S' \in AS(\{r\}^S \cup R)$, while $L \notin S$ for the answer set 
$S \in AS(\emptyset \cup R)$. Hence, $\{r\}$ and $\emptyset$ are not strongly equivalent. 

(II) Suppose that $S$ is inconsistent. In this case, there is a pair of complementary 
literals $L_1$ and $L_2$ in $head^-(r) \cup body^+(r)$. Then, $head^+(r) \cup body^-(r) = \emptyset$ 
holds by (5), that is, $head^+(r) = body^-(r) = \emptyset$. Thus the rule $r$ is an integrity 
constraint with no NAF in the body. Again, let $R = S$. Then, the unique answer 
set of $R$ is $Lit$. However, $Lit$ is not an answer set of $R \cup \{r\}^{Lit}$ because $Lit$ 
cannot satisfy the rule of $\{r\}^{Lit}$, so that there is no answer set of $R \cup \{r\}$. Hence, 
$\{r\}$ and $\emptyset$ are not strongly equivalent. By (I) and (II), $r$ is not valid. □ 

The conditions (i) and (iii) in Theorem 4.4 are considered in the context of 
EDPs without classical negation in [1], in which rules satisfying (i) and (iii) are 
called tautologies and contradictions, respectively. The condition (ii) is similar 
to (i) and is meaningful only for the class of GEDPs with NAF in heads. The 
condition (iv) is necessary for extended programs with classical negation. In 
other words, (i), (ii) or (iii) is the necessary and sufficient condition for a rule 
in a program without classical negation to be valid. According to Theorem 4.4, 
the following rules are all valid: 

$p \leftarrow p, q$, $\quad q \leftarrow p, \mathit{not}\, p$, $\quad q; \mathit{not}\, p \leftarrow \neg p$, $\quad \leftarrow p, \neg p, \mathit{not}\, q$. 

However, the following conditions are excluded from the definition of valid rules. 

(v) $head^+(r) \cap head^-(r) \neq \emptyset$. 

(vi) there are two literals $L_1$ and $L_2$ in $head^+(r)$ such that $L_1 = \overline{L_2}$. 

(vii) $head^+(r) = body^-(r) = \emptyset$ and there are two literals $L_1$ and $L_2$ in 
$head^-(r) \cup body^+(r)$ such that $L_1 = \overline{L_2}$. 

An example for the case (v) is $p; \mathit{not}\, p \leftarrow$, which has two answer sets $\emptyset$ and $\{p\}$. 
Similarly, for the case (vi), $p; \neg p \leftarrow$ has two answer sets $\{p\}$ and $\{\neg p\}$. For the 
case (vii), the integrity constraint $\leftarrow p, \neg p$ eliminates the answer set $Lit$ if it is 
added to $\{p \leftarrow,\ \neg p \leftarrow\}$ as in the proof of Theorem 4.4. 

4.2 Update Equivalence on Common Rules 

Theorem 4.3 implies that only valid rules can be safely eliminated from a program 
under the situation that any update can occur. This is a rather unexpected result. 
In fact, we cannot even show S-update equivalence of $\{p; p \leftarrow q\}$ and $\{p \leftarrow q\}$, 
although the latter rule is just a merged form of the former. The main reason 
why two such programs $P_1$ and $P_2$ are not S-update equivalent is that, when a 
rule $r_1$ in $P_1 \setminus P_2$ is removed, $r_1$ is syntactically different from the semantically 
equivalent rule $r_2$ in $P_2 \setminus P_1$. Thus, removing $Q = \{r_1\}$ from $P_1$ and $P_2$ results 
in elimination of $r_1$ from $P_1$ while $r_2$ remains in $P_2$. 

A rational solution is to exclude removal of rules from $P_1 \Delta P_2$ in testing 
update equivalence. In other words, a removal-addition pair $(Q, R)$ is considered 
for any program $R$ and any set $Q$ of rules from $P_1 \cap P_2$. 

Definition 4.3. Two programs $P_1$ and $P_2$ are update equivalent on common 
rules (C-update equivalent, for short) if $AS((P_1 \setminus Q) \cup R) = AS((P_2 \setminus Q) \cup R)$ 
holds for any pair $(Q, R)$ of programs such that $Q$ consists of rules from $P_1 \cap P_2$ 
and $R$ is any set of rules. 

Update equivalence on common rules is a restricted version of S-update equivalence, 
and hence S-update equivalence implies C-update equivalence. Many important 
program transformations proposed in the literature preserve C-update 
equivalence. For example, the previous program $\{p; p \leftarrow q\}$ is C-update equivalent 
to its merged form $\{p \leftarrow q\}$. In [7], it is shown that any rule of the form 

$\mathit{not}\, L_1; \cdots; \mathit{not}\, L_l \leftarrow L_{l+1}, \ldots, L_m, \mathit{not}\, L_{m+1}, \ldots, \mathit{not}\, L_n$ 

can be transformed to an integrity constraint of the form 

$\leftarrow L_1, \ldots, L_l, L_{l+1}, \ldots, L_m, \mathit{not}\, L_{m+1}, \ldots, \mathit{not}\, L_n$ 

without changing the answer sets. Such a modular transformation preserves 
C-update equivalence as well as strong equivalence [11]. In fact, whenever $P_1 \cap P_2$ is 
empty, $P_1$ and $P_2$ are C-update equivalent if and only if $P_1$ and $P_2$ are strongly 
equivalent. The next theorem generalizes this fact. 

Theorem 4.5. Two programs $P_1$ and $P_2$ are C-update equivalent if and only if 
$P_1 \setminus P_2$ and $P_2 \setminus P_1$ are strongly equivalent. 

Proof. Let $P = P_1 \cap P_2$. If $P_1$ and $P_2$ are C-update equivalent, then 
$AS((P_1 \setminus P) \cup R) = AS((P_2 \setminus P) \cup R)$ holds for any program $R$. Then, $P_1 \setminus P$ and $P_2 \setminus P$ 
are strongly equivalent. That is, $P_1 \setminus P_2$ and $P_2 \setminus P_1$ are strongly equivalent. 

Conversely, suppose that $P_1 \setminus P_2$ and $P_2 \setminus P_1$ are strongly equivalent. Then, 
$P_1 \setminus P$ and $P_2 \setminus P$ are strongly equivalent. That is, $AS((P_1 \setminus P) \cup R) = AS((P_2 \setminus P) \cup R)$ 
holds for any program $R$. Here, $R = (R \cap P) \cup (R \setminus P)$. Let $Q$ be the program such 
that $R \cap P = P \setminus Q$, and $R'$ be the program $R \setminus P$. Then, $(P_i \setminus P) \cup R = (P_i \setminus Q) \cup R'$ 
holds for $i = 1, 2$. Hence, $AS((P_1 \setminus Q) \cup R') = AS((P_2 \setminus Q) \cup R')$ holds. Since 
$R$ is any program, $Q$ can be any subset of $P$ and $R'$ can also be any program. 
This implies that $P_1$ and $P_2$ are C-update equivalent. □ 

On the other hand, an unfold/fold transformation [20] does not preserve
C-update equivalence. For example, $\{p \leftarrow q,\ q \leftarrow r\}$ and $\{p \leftarrow r,\ q \leftarrow r\}$
are weakly equivalent, but are not even strongly equivalent because the addition
of $q$ causes the truth of $p$ in the former only. In Section 1, we have seen that
$\{p \leftarrow,\ q \leftarrow \text{not } p\}$ is strongly equivalent to $\{p \leftarrow\}$. However, these two
programs are not C-update equivalent because removing the common $p \leftarrow$
derives $q$ in the former. Although it is claimed that strong equivalence allows us
to replace the former rules with the latter, we regard such a transformation
as not tolerant of program updates.⁴ Hence, C-update equivalence gives us a
better criterion of program transformation than strong equivalence in
situations where updates may occur.
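
The last counterexample can be checked mechanically. The following is a minimal sketch that is not part of the original paper: it assumes programs without disjunction or classical negation, and the helper names (`rule`, `least_model`, `answer_sets`) are hypothetical. It enumerates answer sets by brute force via the Gelfond-Lifschitz reduct and compares the two programs before and after the common fact $p \leftarrow$ is removed. It inspects this single update only and does not test strong equivalence, which quantifies over all added programs.

```python
from itertools import chain, combinations

def rule(head, pos=(), neg=()):
    """A rule head <- pos, not neg; a fact has empty bodies."""
    return (head, frozenset(pos), frozenset(neg))

def least_model(positive_rules):
    """Least model of a negation-free program by naive fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model

def answer_sets(program):
    """Brute force: M is an answer set iff M is the least model of the reduct."""
    atoms = set()
    for head, pos, neg in program:
        atoms |= {head} | pos | neg
    atoms = sorted(atoms)
    candidates = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    result = set()
    for cand in candidates:
        m = set(cand)
        # Drop rules blocked by M, then drop the remaining negative bodies.
        reduct = [(h, pos) for h, pos, neg in program if not (neg & m)]
        if least_model(reduct) == m:
            result.add(frozenset(m))
    return result

p_fact = rule("p")
former = {p_fact, rule("q", neg=["p"])}   # {p <- , q <- not p}
latter = {p_fact}                          # {p <- }

same_before = answer_sets(former) == answer_sets(latter)
after_former = answer_sets(former - {p_fact})   # q becomes derivable
after_latter = answer_sets(latter - {p_fact})   # only the empty answer set
```

Under these assumptions the two programs agree before the removal and disagree afterwards, which is the behavior the text describes.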

The relationship between several notions of equivalence in logic programs can 
be summarized in the form of relative update equivalence as follows. 

Proposition 4.6. Let $P_1$ and $P_2$ be programs such that $P_1 \subseteq \mathcal{P}$ and $P_2 \subseteq \mathcal{P}$,
where $\mathcal{P}$ is the set of all rules in the language of $P_1$ and $P_2$.

(1) $P_1$ and $P_2$ are S-update equivalent iff they are update equivalent wrt $(\mathcal{P}, \mathcal{P})$
iff they are update equivalent wrt $(P_1 \cup P_2, \mathcal{P})$.

(2) $P_1$ and $P_2$ are C-update equivalent iff they are update equivalent wrt $(P_1 \cap P_2, \mathcal{P})$.

(3) $P_1$ and $P_2$ are strongly equivalent iff they are update equivalent wrt $(\emptyset, \mathcal{P})$.

(4) $P_1$ and $P_2$ are weakly equivalent iff they are update equivalent wrt $(\emptyset, \emptyset)$.

Note in Proposition 4.6 (1) that the removable rules in S-update equivalence
can be set to the union of the two given programs, $P_1 \cup P_2$, rather than the set of all
rules $\mathcal{P}$ in the language. This is because any rule in $\mathcal{P} \setminus (P_1 \cup P_2)$ has no effect
if it is removed from either $P_1$ or $P_2$, that is, both $P_1$ and $P_2$ are unchanged by
such a removal.

5 Uniform Equivalence 

In database updates, updates are permitted only on variable data. Representing a
database as a logic program $P$, $P$ is usually divided into two parts: $P = Int(P) \cup Ext(P)$,
where $Int(P) \cap Ext(P) = \emptyset$. Here, $Ext(P)$ denotes the set of facts in
$P$, called an extensional database, and the set of non-facts $Int(P) = P \setminus Ext(P)$
is called an intensional database. In databases, $Ext(P)$ can be considered as
variable data while $Int(P)$ is regarded as invariable knowledge. Similarly, the
set of all literals in the language is divided into the extensional literals $\mathcal{E}$ and
the intensional literals $\mathcal{I}$ as $Lit = \mathcal{I} \cup \mathcal{E}$, where $\mathcal{I} \cap \mathcal{E} = \emptyset$. Here, $\mathcal{I}$ is the set of
all literals with the predicates appearing in heads of $Int(P)$, and $\mathcal{E}$ is the set of
all other literals. Then, two databases $P_1$ and $P_2$ are equivalent in the sense of
Sagiv [19] if $P_1$ and $P_2$ are strongly equivalent with respect to $\mathcal{E}$.

4 Unlike strong equivalence, RED$^-$, NONMIN, WGPPE and S-IMP in [4] fail to
preserve C-update equivalence (thereby, S-update equivalence).

Sagiv [19] also considers uniform equivalence of two Datalog programs, which
can be defined as follows. Two programs $P_1$ and $P_2$ are uniformly equivalent if
the output of $P_1$ agrees with that of $P_2$ for any input from $Lit = \mathcal{I} \cup \mathcal{E}$, where
the output of $P$ is defined as $\{ S \cap \mathcal{I} \mid S \in AS(P \cup F),\ F \subseteq Lit \}$.

Uniform equivalence implies Sagiv's equivalence. In fact, uniform equivalence
takes an input literal set $F$ not only from the extensional part $\mathcal{E}$ but also from the
intensional one $\mathcal{I}$. Since it is obvious that the extensional part in each answer
set $S$, i.e., $\mathcal{E} \cap S$, is always the same between the two, it turns out that two
programs are uniformly equivalent if and only if they are strongly equivalent
with respect to $Lit$. Sagiv uses the notion of uniform equivalence for minimizing
Datalog programs. Eiter and Fink [2] consider uniform equivalence for normal
and extended disjunctive programs. The notion of (uniform) equivalence can also
be generalized to update equivalence as follows.

Definition 5.1. Let $P_1$ and $P_2$ be programs. Suppose that $\mathcal{I}$ and $\mathcal{E}$ are the sets
of intensional and extensional literals, respectively, which are common to both $P_1$
and $P_2$. Then, $P_1$ and $P_2$ are extensionally update equivalent if they are update
equivalent with respect to $(\mathcal{E}, \mathcal{E})$. On the other hand, $P_1$ and $P_2$ are uniformly
update equivalent if they are update equivalent with respect to $(Lit, Lit)$.



Example 5.1. Suppose that two databases $P_1$ and $P_2$ are given as

$P_1 = \{\, p \leftarrow q,\ \ q \leftarrow \text{not } b,\ \ b \leftarrow \,\}$,

$P_2 = \{\, p \leftarrow q, \text{not } b,\ \ q \leftarrow \text{not } b,\ \ b \leftarrow \,\}$,

where $\mathcal{E} = \{a, b\}$ and $\mathcal{I} = \{p, q\}$. Then, $P_1$ and $P_2$ are extensionally update
equivalent, but are not uniformly update equivalent. In fact, $P_1$ and $P_2$ are
update equivalent with respect to $(\mathcal{E}, \mathcal{E})$, but are not with respect to $(Lit, Lit)$
because $AS(P_1 \cup \{a, q\}) = \{\{a, b, p, q\}\}$ while $AS(P_2 \cup \{a, q\}) = \{\{a, b, q\}\}$.

A database $P$ is disjunctive if disjunctions appear in $P$. Usually, $Ext(P)$
also contains disjunctive facts in disjunctive databases. Let $\mathcal{D}(\mathcal{E})$ be the set of
all disjunctions of literals from $\mathcal{E}$. Then, disjunctive update equivalence of two
databases is defined as update equivalence with respect to $(\mathcal{D}(\mathcal{E}), \mathcal{D}(\mathcal{E}))$. This
can also be represented by the notion of disjunctive explanations in [8].

Often, updates on the invariable part are translated into updates on the
variable part in databases. Updates of this type are called view updates. The view
update problem in databases is concerned with translating an
update request on intensional literals in $\mathcal{I}$ into updates on extensional literals
in $\mathcal{E}$. This problem can be characterized by extended abduction [6], and we will
consider equivalence with respect to abductive updates in another paper.




6 Computation and Complexity 

This section considers the computational aspects of update equivalence. 

We first show a translation of relative update equivalence into relative strong 
equivalence, which is similar to the transformation proposed in [6,8]. 

Given two tested programs $P_1$ and $P_2$ and the updatable rules $\mathcal{Q}$ and $\mathcal{R}$,
we convert the update equivalence problem $(P_1, P_2, \mathcal{Q}, \mathcal{R})$ into the strong
equivalence problem $(K_1, K_2, \mathcal{K}) = (\nu(P_1), \nu(P_2), \mu(\mathcal{Q}) \cup \mathcal{R})$, where $K_1$ and $K_2$ are
programs and $\mathcal{K}$ is a set of insertable rules. To this end, any removable rule $r$
in $\mathcal{Q}$ is associated with a unique literal $\delta_r$ (the name of $r$) through negation as
failure as $\text{not } \delta_r$. In this way, the deletion of $r$ is realized by the addition of $\delta_r$
to the program. Then, the translations $\nu$ and $\mu$ are defined as:

$$\nu(P_i) = (P_i \setminus \mathcal{Q}) \cup \{ (H \leftarrow B, \text{not } \delta_r) \mid r = (H \leftarrow B) \in P_i \cap \mathcal{Q} \} \quad (i = 1, 2),$$
$$\mu(\mathcal{Q}) = \{ \delta_r \mid r \in \mathcal{Q} \}.$$

Theorem 6.1. Suppose that $(P_1, P_2, \mathcal{Q}, \mathcal{R})$ is converted to $(K_1, K_2, \mathcal{K})$ as above.
$P_1$ and $P_2$ are update equivalent with respect to $(\mathcal{Q}, \mathcal{R})$ if and only if $K_1$ and $K_2$
are strongly equivalent with respect to $\mathcal{K}$.

Notice that the translation $\nu$ is modular. Without loss of generality, we can
assume that the removable rules $\mathcal{Q}$ are finite and are included in $P_1 \cup P_2$; if
$\mathcal{Q} \not\subseteq P_1 \cup P_2$, we can substitute $\mathcal{Q}$ with $\mathcal{Q} \cap (P_1 \cup P_2)$ without changing the result
of equivalence testing. See also Proposition 4.6 (1). Then, the translations $\nu$ and
$\mu$ can always be computed in linear time.
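
The translation above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: rules are triples (head, positive body, negative body) with `None` as the head of a constraint, the naming scheme `name_of` for the atom $\delta_r$ is hypothetical, and $\mu(\mathcal{Q})$ is represented as a set of facts.

```python
def rule(head, pos=(), neg=()):
    """A rule head <- pos, not neg; head None encodes an integrity constraint."""
    return (head, frozenset(pos), frozenset(neg))

def name_of(r):
    """Hypothetical naming scheme: derive a fresh atom delta_r from the rule text."""
    head, pos, neg = r
    body = ",".join(sorted(pos) + ["not " + a for a in sorted(neg)])
    return "delta[" + (head or "") + "<-" + body + "]"

def nu(program, removable):
    """Keep non-removable rules; guard each removable rule with 'not delta_r'."""
    translated = set()
    for r in program:
        if r in removable:
            head, pos, neg = r
            translated.add((head, pos, neg | {name_of(r)}))
        else:
            translated.add(r)
    return translated

def mu(removable):
    """One insertable fact delta_r per removable rule r."""
    return {rule(name_of(r)) for r in removable}

# The programs of Example 6.1, with the common constraint as the removable rule.
constraint = rule(None, pos=["q"])
p1 = {rule("p", neg=["q"]), constraint}
p2 = {rule("p"), constraint}
k1 = nu(p1, {constraint})
k2 = nu(p2, {constraint})
insertable = mu({constraint})
```

Each rule is visited once and extended by at most one literal, which matches the linear-time remark in the text.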

Example 6.1. Two programs $\{p \leftarrow \text{not } q,\ \leftarrow q\}$ and $\{p \leftarrow,\ \leftarrow q\}$ are
strongly equivalent. However, they are neither S-update equivalent nor C-update
equivalent. In fact, removing the common constraint $\leftarrow q$ and adding $q \leftarrow$ cause
the deletion of $p$ in the former only. Let us verify this fact. The constraint $\leftarrow q$
is converted to $\leftarrow q, \text{not } \delta_{\leftarrow q}$, so the addition of $\{\delta_{\leftarrow q},\ q\}$ to both converted
programs causes the same effect. On the other hand, if the constraint $\leftarrow q$
as well as the other rules are not removable, no rule is converted, and hence they
become update equivalent with respect to $(\emptyset, \mathcal{R})$ for any $\mathcal{R}$, that is, they are
(relatively) strongly equivalent.

Theorem 6.1 reduces testing relative update equivalence to testing relative
strong equivalence. At the moment, however, no sophisticated procedure
is known for testing relative strong equivalence,⁵ although some useful methods
exist for testing non-relative strong equivalence [18,14,21]. In fact, we can show
that update equivalence in general is harder than strong equivalence as follows.

To establish the computational complexity of relative update equivalence of
propositional programs, we can use Proposition 4.6 (4) and the result by Turner
[21] that deciding weak equivalence of two GEDPs is $\Pi_2^P$-hard.

5 Woltran [22] recently showed that strong/uniform equivalence wrt all rules over
some alphabet can be reduced to weak equivalence. We can also show that strong
equivalence wrt a finite set of rules can be reduced to weak equivalence.




Theorem 6.2. The problem of checking relative update equivalence of two
propositional programs is $\Pi_2^P$-hard in general.

By contrast, checking S-update equivalence of two programs $P_1$ and $P_2$ can
be done in polynomial time by Theorems 4.3 and 4.4. That is, we check whether
each rule in $P_1 \mathbin{\Delta} P_2$ is valid or not. If every rule in $P_1 \mathbin{\Delta} P_2$ is valid, $P_1$ and $P_2$
are S-update equivalent; otherwise, they are not S-update equivalent.

Theorem 6.3. S-update equivalence of two propositional programs can be de- 
cided in polynomial time. 

Now, we compare the notions of weak, strong, and update equivalence in 
the non-relative versions from the complexity viewpoint. In [21], it is shown 
that deciding strong equivalence of two GEDPs is in coNP and that deciding 
weak equivalence of two GEDPs is $\Pi_2^P$-hard. Hence, the strength of S-update
equivalence is reflected in the time complexity of the respective decision prob- 
lems: unless the polynomial hierarchy collapses, deciding S-update equivalence 
is easier than deciding strong equivalence, which in turn is easier than weak 
equivalence. 

Although the notion of S-update equivalence seems too strong to be practical, 
C-update equivalence is more attractive. In fact, Theorem 4.5 indicates that C- 
update equivalence is much closer to strong equivalence. 

Theorem 6.4. The problem of checking C-update equivalence of two proposi- 
tional programs is coNP-complete. 



7 Conclusion 

We have proposed update equivalence in logic programming, investigated its 
properties, considered several variants, and presented their applications. We have 
completely characterized each case of update equivalence. Although the
condition for S-update equivalence is very strong, the one for C-update equivalence is
rather practical. We have also shown that most previously proposed notions of
equivalence in logic programming and deductive databases can be characterized 
by relative update equivalence. The notion of update equivalence can thus be 
used to guarantee the correctness of a program transformation in a dynamic 
setting, and is helpful to optimize logic programs for various applications. 

We can consider a more general form of programs allowing nested expressions
[12]. There are some formalizations of strong equivalence of two nested logic
programs in non-standard logics [18,17,21]. While we have shown that relative
update equivalence can be converted to relative strong equivalence, more direct
connections between these logics and update equivalence are also worth
investigating. Another direction for future work is to characterize many transformation
techniques in logic programming in terms of subclasses of relative update equivalence. New
transformations preserving relative update equivalence should also be developed.




References 

1. S. Brass and J. Dix. Characterization of the disjunctive stable semantics by partial
evaluation. Journal of Logic Programming, 32(3):207-228, 1997.

2. T. Eiter and M. Fink. Uniform equivalence of logic programs under the stable
model semantics. In: Proc. of ICLP 2003, LNCS 2916, pp. 224-238, Springer, 2003.

3. T. Eiter, M. Fink, G. Sabbatini, and H. Tompits. Reasoning about evolving
nonmonotonic knowledge bases. In: Proc. of LPAR 2001, LNAI 2250, pp. 407-421,
Springer, 2001.

4. T. Eiter, M. Fink, H. Tompits, and S. Woltran. Simplifying logic programs under
uniform and strong equivalence. In: Proc. of LPNMR 2004, LNAI 2923, pp. 87-99,
Springer, 2004.

5. M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive
databases. New Generation Computing, 9:365-385, 1991.

6. K. Inoue. A simple characterization of extended abduction. In: Proc. of the 1st
International Conference on Computational Logic, LNAI 1861, pp. 718-732, Springer,
2000.

7. K. Inoue and C. Sakama. Negation as failure in the head. Journal of Logic
Programming, 35(1):39-78, 1998.

8. K. Inoue and C. Sakama. Disjunctive explanations. In: Proc. of ICLP 2002, LNCS
2401, pp. 317-332, Springer, 2002.

9. A. C. Kakas, R. A. Kowalski, and F. Toni. The role of abduction in logic
programming. In: D. M. Gabbay, C. J. Hogger and J. A. Robinson (eds.), Handbook of Logic
in Artificial Intelligence and Logic Programming, volume 5, pp. 235-324, Oxford
University Press, 1998.

10. J. A. Leite. Evolving Knowledge Bases. IOS Press, 2003.

11. V. Lifschitz, D. Pearce, and A. Valverde. Strongly equivalent logic programs. ACM
Transactions on Computational Logic, 2:526-541, 2001.

12. V. Lifschitz, L. R. Tang, and H. Turner. Nested expressions in logic programs.
Annals of Mathematics and Artificial Intelligence, 25:369-389, 1999.

13. V. Lifschitz and T. Y. C. Woo. Answer sets in general nonmonotonic reasoning
(preliminary report). In: Proc. of KR '92, pp. 603-614, Morgan Kaufmann, 1992.

14. F. Lin. Reducing strong equivalence of logic programs to entailment in classical
propositional logic. In: Proc. of KR 2002, pp. 170-176, Morgan Kaufmann, 2002.

15. M. J. Maher. Equivalence of logic programs. In: [16], pp. 627-658, 1988.

16. J. Minker (ed.). Foundations of Deductive Databases and Logic Programming.
Morgan Kaufmann, 1988.

17. M. Osorio, J. A. Navarro, and J. Arrazola. Equivalence in answer set programming.
In: Proc. of LOPSTR 2001, LNCS 2372, pp. 57-75, Springer, 2001.

18. D. Pearce, H. Tompits, and S. Woltran. Encodings for equilibrium logic and logic
programs with nested expressions. In: Proc. of EPIA 2001, LNCS 2258, pp. 306-320,
Springer, 2001.

19. Y. Sagiv. Optimizing Datalog programs. In: [16], pp. 659-668, 1988.

20. H. Tamaki and T. Sato. Unfold/fold transformation of logic programs. In: Proc.
of the 2nd International Conference on Logic Programming, pp. 127-138, 1984.

21. H. Turner. Strong equivalence made easy: nested expressions and weight
constraints. Theory and Practice of Logic Programming, 3(4-5):609-622, 2003.

22. S. Woltran. Characterizations for relativized notions of equivalence in answer set
programming. In: Proc. of JELIA '04, LNAI, this volume, 2004.




Cardinality Constraint Programs 



Tommi Syrjänen*

Helsinki University of Technology, Dept. of Computer Science and Eng.,
Laboratory for Theoretical Computer Science,
P.O.Box 5400, FIN-02015 HUT, Finland
Tommi.Syrjanen@hut.fi



Abstract. We define the class of cardinality constraint logic programs 
and provide a formal stable model semantics for them. The class extends 
normal logic programs by allowing the use of cardinality constraints and 
conditional literals. We identify a decidable subset, omega-restricted pro- 
grams, of the class. We show how the formal semantics can be extended 
to allow the use of evaluated function symbols, such as arithmetic built- 
in operators. The omega-restricted cardinality constraint programs have 
been implemented in the Smodels system. 



1 Introduction 

When we use Answer Set Programming (ASP) to solve a problem, we encode it
using some logical framework so that the solutions correspond to the models
of the system, and then use a solver of the chosen formalism to find them.
What differentiates ASP from traditional logic programming is that an ASP
solution is a set of literals instead of a proof tree as in Prolog and its derivatives.

Most ASP systems use the stable model semantics [6] of logic programs as
their underlying semantics ([12,4,1]) but systems based on propositional logic
([5]) also exist. Most logic-program-based systems have some extensions to the
basic inference rules. For example, dlv [4] allows disjunctive rules and
Smodels [12] has cardinality and weight constraint literals.

The usual way to define a semantics for ASP is to start with variable-free
(ground) programs and then say that the variables are just short-hand constructs
for denoting sets of rules. This approach has the advantage that you do not have
to alter the basic semantics to use variables, and in most cases it is easy to see
what set of ground rules corresponds to a rule with variables. However, things
get more complex when we allow the use of function symbols, especially when
some of the function symbols are built-ins that should be evaluated (like +, −) and
some are uninterpreted Herbrand terms.

In this work we examine in detail how formal stable model semantics can be
defined for non-ground logic programs with cardinality constraints and
conditional literals, and explain why the definitions are made the way
they are. We do it in four steps: 1) we define the semantics for ground programs;
2) add variables to them; 3) define some syntactic sugar to make the language
easier to use; and 4) identify a decidable subset that corresponds to the language
used in the Smodels system.¹ Finally, we show how interpreted built-in
functions fit into the formal semantics. This work is based on [11] and extends [10].
Non-ground cardinality and weight constraints have also been examined earlier
([9,8]). This work has two major differences from the previous work: (1) we allow
the default negation of cardinality constraint literals; and (2) the treatment of
conditional literals is now more precise.

* This work has been supported by the Academy of Finland (project 53695).

J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 187-199, 2004.
© Springer-Verlag Berlin Heidelberg 2004

In the language definition we identify basic primitives and the more complex 
constructs are then translated into programs that use only those primitives. The 
main criteria that we use in selecting the primitives are: 

1. A program with variables can always be replaced by a ground program that
has the same set of stable models;

2. When an extended program is translated into basic primitives no new atoms 
should be generated; 

3. It should be possible to translate a complex program with variables without 
having to instantiate the program first; and 

4. All translations should be linear in size. 

The first criterion means that we keep the standard interpretation of vari- 
ables as short-hand notation for sets of ground rules as this makes defining the 
semantics more convenient. If function symbols are used, then the correspond- 
ing ground program is infinite. The second criterion is included since we want 
to keep a very close link between the original extended program and the trans- 
lated simple one. It also allows us to manipulate the programs by combining or 
splitting them without having to worry about possible clashes introduced by the 
new atoms. The third criterion is connected to the first one with the idea being 
that we can always see the intended meaning of the program without having to 
create the potentially infinite instantiation. A translation is linear if the number 
of new literals and rules is some constant times the size of the original construct. 

2 Language 

A term is either a variable or a function term $f(t_1, \ldots, t_k)$ where $f$ is a $k$-ary
function symbol and $t_1, \ldots, t_k$ are terms. A 0-ary function symbol is a constant.
A term is ground if it does not contain any variables.

An atom is of the form $p(t_1, \ldots, t_k)$ where $p$ is a $k$-ary predicate symbol and
$t_1, \ldots, t_k$ are terms. An atom is ground iff all terms in it are ground. In the
following we use $A$ to denote an atom if we are not interested in its arguments,
and $\mathrm{pred}(A)$ to denote its predicate symbol. The symbol $\top$ denotes a special atom
that is always true. A basic literal is either an atom $A$ (positive) or its negation
$\text{not } A$ (negative). We use $L$ to denote a basic literal and $\bar{L}$ its complement when
their arguments are not relevant. A conditional literal $L_c$ is of the form:

$$X.L : A \tag{1}$$

1 However, the current version of Smodels recognizes a slightly simpler language. 




where the main literal $L$ is a basic literal, the condition $A$ is an atom, and $X$
is a set of local variables. If $X = \emptyset$, we denote (1) simply as $L : A$, and if $X$ is
a singleton set we write it without braces. All variables that occur in $L_c$ that
are not local are global. Intuitively $L : A$ can be seen as a conjunction that is
evaluated in two phases: first $A$ is checked, and if it is true, then the truth value
of $L$ determines the truth value of the whole construct. A conditional literal is
positive if $L$ is, and it is negative otherwise. A literal is either a basic literal or
a conditional literal. We use the notation $\mathcal{L}$ to denote literals.

A cardinality constraint $C$ is of the form:

$$C = \mathrm{Card}(b, S) \tag{2}$$

where $b$ is an integral bound and $S$ is a set of literals. The basic intuition is that $C$
is true if the number of true literals $\mathcal{L} \in S$ is greater than or equal to $b$.²
The set of positive literals in $S$ is denoted by $\mathrm{pos}(C)$ and the set of negative literals
by $\mathrm{neg}(C)$. A cardinality constraint literal $\mathcal{C}$ is either a cardinality constraint $C$
or its negation $\text{not } C$.

A basic rule is of the form:

$$A \leftarrow \mathcal{C}_1, \ldots, \mathcal{C}_n \tag{3}$$

where the head $A$ is an atom, and the $\mathcal{C}_i$ in its body are cardinality constraint literals.
The rule (3) encodes the fact that if all literals in the body are true, then the
head must also be true. If the body is empty ($n = 0$), then we call a basic rule
a fact. As above, $\mathrm{pos}(R)$ and $\mathrm{neg}(R)$ denote the sets of positive and negative
cardinality constraint literals in the rule body.

We immediately define two shorthand notations that are used in our examples:
1) a basic literal $L$ in a rule body denotes the constraint literal $\mathrm{Card}(1, \{L\})$;
and 2) an empty head is replaced by a new atom $f$ that does not occur anywhere
in the program, and a rule $f \leftarrow f, \text{not } f$ that causes a contradiction if $f$ is true
is added to the program. Thus, such a rule acts as a constraint on the models and a model
candidate that makes its body true is rejected.

A choice rule has the form:

$$\{A\} \leftarrow \mathcal{C}_1, \ldots, \mathcal{C}_n \tag{4}$$

where $A$ and $\mathcal{C}_i$ are defined as above. The intuition is that if the rule body is
true, then $A$ may be true but it does not have to be true. A cardinality constraint
program is a set of rules.

For each formula (term, literal, cardinality constraint, or rule) F, Var (F) 
denotes the set of variables that occur in it. 

These definitions conclude our basic language. In Section 4 we define several 
extended language constructs that are then translated to the basic language. 

2 In Smodels syntax this is written as b S.




3 Stable Model Semantics 

3.1 Ground Programs 

In this section we define the stable model semantics for the basic cardinality
constraint programs. We do the definition in stages, starting from the simplest
possible programs and then extending the definitions to cover the full basic language.
In the definition we will be using the notation $M \models F$ to denote that a set of
ground atoms $M$ satisfies the formula $F$. In particular, for an atom $A$, $M \models A$ iff
$A \in M$ and $M \models \text{not } A$ iff $A \notin M$. In case of a cardinality constraint $\mathrm{Card}(b, S)$
where $S$ contains only basic literals, $M \models \mathrm{Card}(b, S)$ iff $b \le |\{L \in S \mid M \models L\}|$,
and $M \models \text{not } \mathrm{Card}(b, S)$ iff $M \not\models \mathrm{Card}(b, S)$.

In the simplest case the rules have only basic literals and positive cardinality
constraint literals in their bodies. We call these simple rules. They are essentially
equivalent to the extended programs presented in [8].

Definition 1. Let $C = \mathrm{Card}(b, S)$ be a ground cardinality constraint and $M$ be
a set of ground atoms. Then, the reduct $C^M$ is the cardinality constraint:

$$C^M = \mathrm{Card}(b', \mathrm{pos}(C)) \tag{5}$$

where $b' = b - |\{L \in \mathrm{neg}(C) \mid M \models L\}|$. The reduct $R^M$ of a ground simple basic
rule $R = A \leftarrow C_1, \ldots, C_n$ is the singleton set of rules:

$$R^M = \{A \leftarrow C_1^M, \ldots, C_n^M\} \tag{6}$$

and the reduct of a ground simple choice rule $R = \{A\} \leftarrow C_1, \ldots, C_n$ is the set
of rules:

$$R^M = \begin{cases} \{A \leftarrow C_1^M, \ldots, C_n^M\}, & \text{if } A \in M \\ \emptyset, & \text{otherwise.} \end{cases} \tag{7}$$

A reduct of a set $P$ of ground simple rules is the set:

$$P^M = \bigcup_{R \in P} R^M. \tag{8}$$

Example 1. Consider the rule $R = \{a\} \leftarrow \mathrm{Card}(2, \{b, \text{not } c\})$ and let $M = \{a\}$.
Then, $R^M = \{a \leftarrow \mathrm{Card}(1, \{b\})\}$.
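
Definition 1 and Example 1 can be traced in code. The sketch below is only an illustration under assumptions of our own: a ground constraint over basic literals is stored as a triple (bound, atoms, atoms under not), and the function names are hypothetical rather than part of any Smodels interface.

```python
def card(bound, pos=(), neg=()):
    """Card(bound, S), with S split into atoms (pos) and atoms under 'not' (neg)."""
    return (bound, frozenset(pos), frozenset(neg))

def satisfies(model, constraint):
    """M |= Card(b, S): at least b literals of S are true in M."""
    bound, pos, neg = constraint
    return bound <= len(pos & model) + len(neg - model)

def reduct_constraint(constraint, model):
    """Equation (5): drop negative literals, lowering the bound by those M satisfies."""
    bound, pos, neg = constraint
    satisfied_negative = len(neg - model)
    return card(bound - satisfied_negative, pos)

def reduct_rule(head, is_choice, body, model):
    """Equations (6) and (7): a choice rule survives only if its head is in M."""
    if is_choice and head not in model:
        return []
    return [(head, [reduct_constraint(c, model) for c in body])]

# Example 1: R = {a} <- Card(2, {b, not c}) and M = {a}.
M = frozenset({"a"})
example_1 = reduct_rule("a", True, [card(2, pos=["b"], neg=["c"])], M)
```

Here `example_1` contains the single rule with head `a` and body `card(1, pos=["b"])`, matching the reduct stated in Example 1.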

Note that all rules that belong to a reduct of a program $P$ are basic rules and
all basic literals that occur in them are positive. Such rules are monotone [8],
so the reduct $P^M$ has a unique least model that we denote with $MM(P^M)$. The
least model is the least fixpoint of the operator $T_P$ where
$T_P(S) = \{A \mid A \leftarrow C_1^M, \ldots, C_n^M \in P^M \text{ and } S \models C_1^M, \ldots, C_n^M\}$ [8]. If this least model happens to
coincide with $M$, then $M$ is a stable model of $P$.

Definition 2. Let $P$ be a ground simple cardinality constraint program. A set
of ground atoms $M$ is a stable model of $P$ if and only if:

$$MM(P^M) = M. \tag{9}$$




Next, we extend the semantics by allowing ground conditional literals as well as 
negative cardinality constraints. We add an extra step, expansion, that is done 
before reduction and at this point all conditional literals L : A are either replaced 
by L if A is true or removed altogether if A is false. 

Definition 3. Let $M$ be a set of ground atoms. Then, the expansion of a ground
basic literal $L$ with respect to $M$ is $E(L, M) = \{L\}$, and the expansion of a ground
conditional literal $L_c = L : A$ is

$$E(L_c, M) = \begin{cases} \{L\}, & A \in M \\ \emptyset, & \text{otherwise.} \end{cases} \tag{10}$$

The expansion of a ground cardinality constraint $\mathrm{Card}(b, S)$ with respect to $M$
is the cardinality constraint:

$$\mathrm{Card}\Big(b, \bigcup_{\mathcal{L} \in S} E(\mathcal{L}, M)\Big). \tag{11}$$


Example 2. Let $C = \mathrm{Card}(1, \{a : \top, \text{not } b, c : d, e : f\})$ and $M = \{d\}$. Then,
$E(C, M) = \mathrm{Card}(1, \{a, \text{not } b, c\})$.
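
The expansion step of Definition 3 is simple enough to spell out. The following sketch assumes a representation chosen for this illustration only: a literal is a triple (sign, atom, condition), where the condition is `None` for a basic literal and the sentinel `TOP` stands for the always-true atom.

```python
TOP = "#top"  # sentinel for the special atom that is always true (an assumption)

def expand_literal(literal, model):
    """E(L, M) for a basic literal, E(L : A, M) for a conditional one."""
    sign, atom, condition = literal
    if condition is None or condition == TOP or condition in model:
        return {(sign, atom, None)}
    return set()

def expand_constraint(bound, literals, model):
    """Equation (11): union of the expansions of all literals in S."""
    expanded = set()
    for literal in literals:
        expanded |= expand_literal(literal, model)
    return bound, expanded

# Example 2: C = Card(1, {a : T, not b, c : d, e : f}) and M = {d}.
S = [("+", "a", TOP), ("-", "b", None), ("+", "c", "d"), ("+", "e", "f")]
example_2 = expand_constraint(1, S, {"d"})
```

The conditional literal `e : f` disappears because `f` is not in the model, while `a : T` and `c : d` contribute their main literals.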

We obtain the reduct of a ground rule by first expanding all constraint literals 
in its body, and then removing the negative constraints from its body in a way 
that is analogous to the original Gelfond-Lifschitz reduction [6].

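Definition 4. Let $R = A \leftarrow C_1, \ldots, C_n, \text{not } C'_1, \ldots, \text{not } C'_m$ be a basic
cardinality constraint rule and $M$ a set of ground atoms. Then, the reduct $R^M$
is:

$$R^M = \begin{cases} \{A \leftarrow E(C_1, M)^M, \ldots, E(C_n, M)^M\}, & M \not\models E(C'_1, M)^M, \ldots, M \not\models E(C'_m, M)^M \\ \emptyset, & \text{otherwise.} \end{cases} \tag{12}$$

Let $R = \{A\} \leftarrow C_1, \ldots, C_n, \text{not } C'_1, \ldots, \text{not } C'_m$ be a choice rule. Then,

$$R^M = \begin{cases} \{A \leftarrow E(C_1, M)^M, \ldots, E(C_n, M)^M\}, & M \not\models E(C'_1, M)^M, \ldots, M \not\models E(C'_m, M)^M \text{ and } A \in M \\ \emptyset, & \text{otherwise.} \end{cases} \tag{13}$$

As was the case with the simple programs, $M$ is a stable model of $P$ if and only
if $M = MM(P^M)$.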

Example 3. Let $P$ be the program:

{a} ← not Card(1, {b : a})

{b} ← Card(1, {not a})

Now $P$ has three stable models: $M_1 = \emptyset$, $M_2 = \{a\}$, and $M_3 = \{b\}$. In the
case of $M_2$, the reduct $P^{M_2} = \{a \leftarrow\}$ since $M_2 \models \text{not } \mathrm{Card}(1, \{b : a\})^{M_2} = \text{not } \mathrm{Card}(1, \{b\})$.
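
The three stable models of Example 3 can be confirmed by brute force. This is a sketch under our own assumptions, not the Smodels algorithm: literals are triples (atom, positive, condition), constraints are pairs (bound, literals), rules are tuples (head, is_choice, positive constraints, negated constraints), and all candidate sets over the given atoms are enumerated. The negated constraints are tested directly on the expanded constraint, which gives the same truth value as testing its reduct.

```python
from itertools import combinations

def lit(atom, positive=True, cond=None):
    return (atom, positive, cond)

def expand(constraint, m):
    """Definition 3: keep the main literal of L : A only when A is in M."""
    bound, literals = constraint
    return bound, [(a, pos) for a, pos, cond in literals if cond is None or cond in m]

def holds(constraint, m):
    """M |= Card(b, S) for a constraint over basic literals."""
    bound, literals = constraint
    return bound <= sum((a in m) == pos for a, pos in literals)

def reduce_constraint(constraint, m):
    """Definition 1: drop negative literals, lowering the bound per satisfied one."""
    bound, literals = constraint
    satisfied_neg = sum(1 for a, pos in literals if not pos and a not in m)
    return bound - satisfied_neg, [(a, pos) for a, pos in literals if pos]

def reduct(program, m):
    """Definition 4 for basic rules and choice rules."""
    rules = []
    for head, is_choice, pos_body, neg_body in program:
        if is_choice and head not in m:
            continue
        if any(holds(expand(c, m), m) for c in neg_body):
            continue
        rules.append((head, [reduce_constraint(expand(c, m), m) for c in pos_body]))
    return rules

def least_model(rules):
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and all(holds(c, model) for c in body):
                model.add(head)
                changed = True
    return model

def stable_models(program, atoms):
    found = []
    for k in range(len(atoms) + 1):
        for cand in combinations(sorted(atoms), k):
            if least_model(reduct(program, set(cand))) == set(cand):
                found.append(set(cand))
    return found

program = [
    ("a", True, [], [(1, [lit("b", cond="a")])]),       # {a} <- not Card(1, {b : a})
    ("b", True, [(1, [lit("a", positive=False)])], []),  # {b} <- Card(1, {not a})
]
models = stable_models(program, {"a", "b"})
```

The enumeration is exponential in the number of atoms, so it is only meant for checking small examples by hand.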





3.2 Programs with Variables 

A rule with variables denotes the set of ground rules that can be obtained by 
replacing each variable by terms of the Herbrand universe of the program. As 
there are two types of variables, local and global, the instantiation is defined in 
two parts. First, the local variables in conditional literals are replaced by their 
instantiations, and then the same is done for the global variables. 

The Herbrand universe HU(P) of a cardinality constraint program P is the 
set of all ground terms that can be constructed using the function terms that 
occur in P. 



Definition 5. A substitution is a function $\sigma_{V,U} : V \to U$ that maps a set of
variables $V$ to a universe $U$. The set of all substitutions from $V$ to $U$ is denoted
by $\mathrm{Sub}(V, U)$.

A substitution applied to a variable $v$ is the term:

$$v\sigma_{V,U} = \begin{cases} \sigma_{V,U}(v), & v \in V \\ v, & \text{otherwise,} \end{cases} \tag{14}$$

and a substitution applied to a function term $t = f(t_1, \ldots, t_n)$ is the term
$t\sigma = f(t_1\sigma, \ldots, t_n\sigma)$. For atoms and literals the substitution is applied similarly
to the case of function terms, that is, their arguments are substituted.

Definition 6. Let $P$ be a program and $L_c = X.L : A$ be a conditional literal
that occurs in it. Then, the local instantiation $I(L_c, P)$ of $L_c$ is the set:

$$I(L_c, P) = \{L\sigma : A\sigma \mid \sigma \in \mathrm{Sub}(X, HU(P))\}. \tag{15}$$

The local instantiation of a basic literal $L$ is the set $I(L, P) = \{L\}$, and the
local instantiation of a cardinality constraint $C = \mathrm{Card}(b, S)$ is the constraint
$I(C, P) = \mathrm{Card}(b, \bigcup_{\mathcal{L} \in S} I(\mathcal{L}, P))$.



Definition 7. The instantiation of a rule $R = A \leftarrow C_1, \ldots, C_n$ is the set of
rules:

$$HI(R, P) = \{A\sigma \leftarrow I(C_1, P)\sigma, \ldots, I(C_n, P)\sigma \mid \sigma \in \mathrm{Sub}(\mathrm{Var}(R), HU(P))\} \tag{16}$$

The Herbrand instantiation of a program $P$ is the set of rules:

$$HI(P) = \bigcup_{R \in P} HI(R, P). \tag{17}$$



Example 4. Consider the following encoding $H$ of the Hamilton cycle problem:

{hc(X, Y)} ← arc(X, Y)
← Card(2, {Y.hc(X, Y) : arc(X, Y)}), vtx(X)
r(Y) ← Card(1, {start(X), r(X)}), hc(X, Y), arc(X, Y)
← vtx(X), not r(X).




vtx(a).    arc(a, b).
vtx(b).    arc(a, c).
vtx(c).    arc(b, c).
start(a).  arc(c, a).

Fig. 1. Sample graph for the Hamilton cycle problem



The first rule asserts that each arc of the graph may belong to the Hamiltonian
cycle, and the second one ensures that at most one arc may leave from any vertex
of the graph. The third rule computes the set of visited vertices starting from
an initial vertex, and the last rule ensures that all vertices belong to the cycle.

Next we add to the program the data corresponding to the graph that is
presented in Figure 1. Then, the local instantiation of the constraint literal in
the second rule is: {hc(X, a) : arc(X, a), hc(X, b) : arc(X, b), hc(X, c) : arc(X, c)},
so the instantiation of the rule is:

← Card(2, {hc(a, a) : arc(a, a), hc(a, b) : arc(a, b), hc(a, c) : arc(a, c)}), vtx(a)

← Card(2, {hc(b, a) : arc(b, a), hc(b, b) : arc(b, b), hc(b, c) : arc(b, c)}), vtx(b)

← Card(2, {hc(c, a) : arc(c, a), hc(c, b) : arc(c, b), hc(c, c) : arc(c, c)}), vtx(c).
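
The two-phase instantiation used in this example can be reproduced with a short script. This is a sketch under simplifying assumptions: atoms are pairs (predicate, argument tuple), variables are just argument names that appear in the substitution, and the Herbrand universe contains constants only, so the function names below are illustrative and nested function terms are not handled.

```python
from itertools import product

def substitute(atom, sigma):
    """Apply a substitution to the arguments of an atom (Definition 5)."""
    pred, args = atom
    return (pred, tuple(sigma.get(t, t) for t in args))

def local_instantiation(local_vars, main, cond, universe):
    """Definition 6: one copy of L : A for every substitution of the local variables."""
    local_vars = sorted(local_vars)
    instances = []
    for values in product(sorted(universe), repeat=len(local_vars)):
        sigma = dict(zip(local_vars, values))
        instances.append((substitute(main, sigma), substitute(cond, sigma)))
    return instances

universe = {"a", "b", "c"}
# Local phase for Y.hc(X, Y) : arc(X, Y); the global variable X is left untouched.
local = local_instantiation({"Y"}, ("hc", ("X", "Y")), ("arc", ("X", "Y")), universe)
# Global phase for one substitution, X -> a.
for_a = [(substitute(m, {"X": "a"}), substitute(c, {"X": "a"})) for m, c in local]
```

Running the global phase for `b` and `c` as well yields the three ground constraints listed in the example.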

4 The Extended Language 

In this section we define several additional syntactic constructs for cardinality 
constraint programs. They are seen as notational shortcuts for larger basic lan- 
guage constructs. We have found that these extensions are useful in practice for 
modeling several different problem domains. 



Upper Bounds for Constraints. We also allow a cardinality constraint to have
an upper bound $u$. The intuition is that a constraint $C_u = \mathrm{Card}(b, u, S)$
is true iff the number of satisfied literals in it is between the bounds, inclusive.
We can express $C_u$ in basic syntax by replacing it with two constraint literals:
Card(b, S) and not Card(u + 1, S). That is, $C_u$ is true if the lower
bound is met but it is not the case that the number of satisfied literals is strictly
greater than the upper bound.
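The translation of the upper bound can be illustrated on ground literals. The sketch below assumes atoms are plain strings, a negative literal is the pair `("not", atom)`, and a model is a set of true atoms; the function names are illustrative and not part of the paper.

```python
def holds(literal, model):
    """Truth of a ground basic literal: an atom (string) or ("not", atom)."""
    if isinstance(literal, tuple) and literal[0] == "not":
        return literal[1] not in model
    return literal in model


def card(b, literals, model):
    """Card(b, S): at least b literals of S are satisfied."""
    return sum(1 for lit in literals if holds(lit, model)) >= b


def card_bounded(b, u, literals, model):
    """Card(b, u, S), expressed as Card(b, S) and not Card(u + 1, S)."""
    return card(b, literals, model) and not card(u + 1, literals, model)


S = ["a", "b", ("not", "c")]
print(card_bounded(1, 2, S, {"a", "c"}))   # exactly one literal is satisfied
print(card_bounded(1, 2, S, {"a", "b"}))   # all three literals are satisfied
```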



Conditional Literals in Rule Bodies. We need a more complex translation
if we want to use a conditional literal $X.L : A$ in a rule body. An intuitive
semantics for such a literal is universal quantification: the non-ground conditional
literal should be true if $L\sigma$ is true whenever $A\sigma$ is. Thus, we replace it by a
negative constraint literal:

    not Card(1, {X. not L : A}).    (18)

This construct is analogous to the classical equivalence of $\forall x.p(x)$ and $\neg\exists x.\neg p(x)$.
However, in Section 5 we see that it differs from the classical case in one
important sense.







Positive Cardinality Constraint Literals in Rule Heads. Thus far we
have had only atoms in rule heads. However, many problems have natural
representations where there are cardinality constraints in the heads, so we want
to allow them with the restriction that all literals that occur in the head are
positive. A rule

\[ \mathrm{Card}(b, u, \{X_1.A_1 : D_1, \ldots, X_n.A_n : D_n\}) \leftarrow \mathit{body} \tag{19} \]

is translated into $n + 2$ rules:

\[
\begin{aligned}
\{A'_i\} &\leftarrow D'_i, \mathit{body} \qquad (1 \le i \le n) \\
 &\leftarrow \text{not } \mathrm{Card}(b, \{X_1.A_1 : D_1, \ldots, X_n.A_n : D_n\}), \mathit{body} \\
 &\leftarrow \mathrm{Card}(u + 1, \{X_1.A_1 : D_1, \ldots, X_n.A_n : D_n\}), \mathit{body}
\end{aligned}
\tag{20}
\]

where $A'_i$ and $D'_i$ are obtained from $A_i$ and $D_i$ by renaming all local variables
that occur in them. The first rule allows us to include any atom $A_i$ in the model when
the body is true, while the next two rules ensure that the number of such atoms is
between the bounds. In the first rule we effectively change local variables into global
ones, but this does not cause any problems since they stay local variables in the other
two rules. If there is no upper bound, then the third rule is not necessary. Note
also that if $b = 0$, then the second rule is trivially satisfied, so in that
case we obtain behavior that is identical to choice rules.

Example 5. The rule Card(1, 2, {X.a(X) : b(X)}) ← c(X) is translated to

    {a(X')} ← b(X'), c(X)
    ← not Card(1, {X.a(X) : b(X)}), c(X)
    ← Card(3, {X.a(X) : b(X)}), c(X).

5 Discussion on the Basic and Extended Languages 

In this section we examine the motivations for the definitions in the previous 
two sections. The very first question is: why cardinality literals and no literals 
for other aggregate types? The immediate reason is that they form a simple and 
well-understood base so the semantics stays relatively clear and intuitive. This 
simple case can then be extended to more complex aggregates when necessary. 

Another, less obvious reason is that cardinality constraints can be used
wherever a basic literal can occur in normal programs, including in rule
heads in the case of the full language. This means that we can use the full power
of non-monotonic reasoning with them. A common approach for defining the
stable model semantics for aggregates is to use them like negative literals in the
reduct: if an aggregate is satisfied by the model candidate, it is removed, but if
it is unsatisfied, the whole rule is discarded [7]. However, this makes it possible
for an aggregate to justify itself in the model even when all basic literals in it
are positive and ground. For example, if this approach is taken in the program

    a ← Card(1, {a}),







then $M = \{a\}$ is a valid stable model as $M \models \mathrm{Card}(1, \{a\})$ even though the
only justification for a is a itself.

Note that our semantics allows a circular justification in the case where an atom
depends on itself via double negation, as in

    a ← not Card(1, {not a}).

This example shows that Card(1, {a}) and not Card(1, {not a}) are not equivalent
even though they are indistinguishable in the classical sense. This situation
is analogous to normal programs, where a ← a and a ← not noa; noa ← not a
have different stable models.

Positive cycles are also the reason why the basic language does not have
conditional literals in rule heads. Consider the hypothetical rule R = {a : a} ←
that encodes the silly condition that a may be true if a is true. Now, if we
expanded the condition before taking the reduct, then {a} would be a model
of R even though the only justification for a is that a is true.

The combination of local variables and conditional literals is interesting in
the sense that you can use them to implement existential quantification in
rule bodies. In effect, a cardinality constraint Card(1, {X.a(X) : b(X)}) encodes
an existential quantification $\exists x.(a(x) \wedge b(x))$. A straightforward way to
encode such a quantification using only normal programs would need a new
atom to do it. Similarly, we get a universal quantification $\forall x.(b(x) \rightarrow a(x))$
using not Card(1, {X. not a(X) : b(X)}). Encoding the universal quantification
without conditional literals is tricky, as you have to separate the cases where
b(X) is never true, so the implication is trivially true, from the cases where b(X)
is sometimes true.
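The two quantifier encodings can be compared against a fixed set of true atoms. The sketch below is a classical evaluation only: it ignores the question of how atoms are justified under the stable model semantics, and the representation of the extensions of a and b as Python sets is an assumption.

```python
def exists_a_and_b(universe, a, b):
    """Card(1, {X.a(X) : b(X)}): some X with b(X) also satisfies a(X)."""
    satisfied = [x for x in universe if x in b and x in a]
    return len(satisfied) >= 1


def forall_b_implies_a(universe, a, b):
    """not Card(1, {X. not a(X) : b(X)}): no X with b(X) fails a(X)."""
    counterexamples = [x for x in universe if x in b and x not in a]
    return not (len(counterexamples) >= 1)


universe = [1, 2, 3]
print(exists_a_and_b(universe, a={1}, b={1, 2}))      # 1 is a witness
print(forall_b_implies_a(universe, a={1}, b={1, 2}))  # 2 is a counterexample
print(forall_b_implies_a(universe, a={1}, b=set()))   # vacuous case: b is never true
```

The last call shows the case singled out in the text: when b(X) is never true, the constraint literal has no instances and the negated constraint holds trivially.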



6 Omega-Restricted Programs 

If a logic program has function symbols in it, then its Herbrand instantiation is 
infinite and deciding whether it has a stable model or not becomes undecidable 
even when only Horn rules are used [3]. Also, even if the instantiation is finite, 
it may be exponential in size so constructing it may be intractable. Fortunately, 
in most cases most of the rules of the instantiation have trivially unsatisfiable 
bodies so we may leave them out without affecting the set of stable models. 

Next we identify a decidable subset of cardinality constraint programs,
namely $\omega$-restricted programs [10]. The basic idea is to enforce syntactic
restrictions on the programs so that it can be guaranteed that all their stable
models are finite. The predicate symbols of a program are arranged into a
stratification where more complex predicates are defined in terms of simpler ones.
The stratification extends the usual definition of stratification [2] by adding a
new level, the $\omega$-stratum, to contain the unstratifiable part of the program.

A constraint literal is simple if it is of the form $\mathrm{Card}(1, \{A : \top\})$. Intuitively,
a rule is $\omega$-restricted if each variable that occurs in it occurs also in some positive
simple constraint literal whose main predicate is on a strictly lower stratum than
the head of the rule. This condition ensures that stable models stay finite.








Fig. 2. The Dependency graph of the Hamilton cycle program 



6.1 Dependency Graphs 

We start by defining a few helper notations. The set $\mathrm{bbody}^+(R)$ contains all
positive basic literals and main literals of positive conditional literals that occur
within positive cardinality constraint literals of the body of $R$. Thus,

\[ \mathrm{bbody}^+(R) = \bigcup_{C \in \mathrm{pos}(R)} \{ L \mid L \in \mathrm{pos}(C) \text{ or } X.L : A \in \mathrm{pos}(C) \}, \]

while $\mathrm{bbody}^-(R)$ contains all other basic and main literals. Finally, $\mathrm{simple}(R)$
denotes the set of simple constraint literals in the body of $R$.

Definition 8. A dependency graph of a program $P$ is a triple $D_P = (V, E^+, E^-)$
where $V$ is the set of predicate symbols occurring in $P$, and
$E^+, E^- \subseteq V \times V$ are the sets of positive and negative dependency edges, where
$(p, q) \in E^+$ iff there exists a rule $R \in P$ where $p$ occurs in the head and $q$ occurs
in $\mathrm{bbody}^+(R)$, and $(p, q) \in E^-$ if at least one of the following conditions holds:

1. $p$ occurs in a head of a rule $R \in P$ and $q \in \mathrm{bbody}^-(R)$;

2. there is a conditional literal in $P$ where $p$ occurs as the main literal and $q$
as the condition; or

3. $p = q$ and $p$ occurs in the head of a choice rule.

A dependency path of a program $P$ is a sequence $(p_1, \ldots, p_n)$ of predicate symbols
such that $(p_i, p_{i+1}) \in E^+ \cup E^-$ for all $1 \le i < n$. A path is negative if for
some $i$, $(p_i, p_{i+1}) \in E^-$.

Intuitively, a predicate $p$ depends on $q$ if we cannot know the value of $p$
without knowing the value of $q$. The definition of $E^+$ is straightforward, but the
one for $E^-$ is more involved. The first case of its definition corresponds directly
to the $E^+$ case. The second one is used to ensure that each time we
instantiate a conditional literal $X.L : A$, we already know the extension of $A$,
so we can leave out those instances of $L$ that have unsatisfied conditions. The
last case is an artificial definition that prevents us from mistakenly concluding
that $p$ has a fixed extension if one of its rules is a choice rule.

Example 6. The dependency graph of the Hamilton cycle program is shown in 
Figure 2. The solid lines are positive arcs and the dashed lines are negative. 







Definition 9. A predicate $p$ depends on a predicate $q$ (denoted $q \prec p$) in a
program $P$ iff there exists a dependency path $(p, \ldots, q)$ in $D_P$. The dependence
is negative (denoted $q \prec_{-} p$) iff there exists a negative dependency path $(p, \ldots, q)$
in $D_P$.

Next, we define the $\omega$-stratification. A predicate has to be on at least as high a
stratum as the predicates that it depends on positively, and it has to be on a
higher stratum if the dependency is negative, unless both predicates are on the
$\omega$-stratum.

Definition 10. An $\omega$-stratification of a program $P$ is a function
$S : \mathcal{P}(P) \rightarrow \mathbb{N} \cup \{\omega\}$ such that:

1. $\forall p, q \in \mathcal{P}(P)$: if $q \prec p$, then $S(p) \ge S(q)$;

2. $\forall p, q \in \mathcal{P}(P)$: if $q \prec_{-} p$, then $S(p) > S(q)$ or $S(p) = S(q) = \omega$.

An $\omega$-stratification is strict iff:

1. if $q \prec p$, $p \not\prec q$ and $S(p) < \omega$, then $S(p) > S(q)$; and

2. if $S(p) = \omega$, then there exists $q \in \mathcal{P}(P)$ such that $S(q) = \omega$ and $q \prec_{-} p$.

In [11] it is proved that every cardinality constraint program has a strict
$\omega$-stratification and that all strict stratifications of a program are essentially equivalent.

Example 7. The Hamilton cycle program has the following strict $\omega$-stratification:
S(vtx) = S(arc) = S(start) = 0, S(r) = S(hc) = $\omega$.
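The conditions of Definitions 8 to 10 can be checked mechanically for the Hamilton cycle program. In the following Python sketch the edge sets are read off the four rules by hand, $\omega$ is represented by `math.inf`, and the first strictness condition is read as "p depends on q but q does not depend on p"; these choices and all names are assumptions of the sketch, not the construction used in [11].

```python
import math

OMEGA = math.inf  # stands in for the omega-stratum

PREDICATES = {"vtx", "arc", "start", "hc", "r"}
# Positive edges: head predicate -> predicate in bbody+ of the same rule.
E_POS = {("hc", "arc"), ("r", "start"), ("r", "r"), ("r", "hc"), ("r", "arc")}
# Negative edges: the conditional literal hc(X, Y) : arc(X, Y), and the
# self-loop for hc because it occurs in the head of a choice rule.
E_NEG = {("hc", "arc"), ("hc", "hc")}


def reachable(edges, nodes):
    """Reflexive-transitive closure: reach[p] is the set reachable from p."""
    reach = {p: {p} for p in nodes}
    changed = True
    while changed:
        changed = False
        for p, q in edges:
            for s in nodes:
                if p in reach[s] and not reach[q] <= reach[s]:
                    reach[s] |= reach[q]
                    changed = True
    return reach


def dependencies(e_pos, e_neg, nodes):
    """Pairs (p, q) where p depends on q, and where it does so negatively."""
    all_edges = e_pos | e_neg
    reach = reachable(all_edges, nodes)
    dep = {(p, q) for p in nodes for (u, v) in all_edges
           if u in reach[p] for q in reach[v]}
    neg = {(p, q) for p in nodes for (u, v) in e_neg
           if u in reach[p] for q in reach[v]}
    return dep, neg


def is_stratification(S, dep, neg):
    ok_pos = all(S[p] >= S[q] for p, q in dep)
    ok_neg = all(S[p] > S[q] or S[p] == S[q] == OMEGA for p, q in neg)
    return ok_pos and ok_neg


def is_strict(S, dep, neg):
    c1 = all(S[p] > S[q] for p, q in dep
             if (q, p) not in dep and S[p] < OMEGA)
    c2 = all(any(S[q] == OMEGA and (p, q) in neg for q in S)
             for p in S if S[p] == OMEGA)
    return c1 and c2


S_EXAMPLE_7 = {"vtx": 0, "arc": 0, "start": 0, "r": OMEGA, "hc": OMEGA}
dep, neg = dependencies(E_POS, E_NEG, PREDICATES)
print(is_stratification(S_EXAMPLE_7, dep, neg), is_strict(S_EXAMPLE_7, dep, neg))
```

Placing hc on a finite stratum fails the second condition of Definition 10, because hc depends negatively on itself through the choice rule.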

Definition 11. The $\omega$-valuation $\Omega$ of a rule $R$ under an $\omega$-stratification $S$ is
$\Omega(R, S) = S(\mathrm{pred}(\mathrm{head}(R)))$, and the valuation of a global variable $V$ in a rule $R$
is:

\[ \Omega(V, R, S) = \min(\{ S(\mathrm{pred}(A)) \mid A : \top \in \mathrm{simple}(R) \} \cup \{\omega\}) \tag{21} \]

Definition 12. A conditional literal $X.L : A$ is $\omega$-restricted under a stratification
$S$ iff $S(L) > S(A)$ and $X \subseteq \mathrm{Var}(A)$. A rule $R$ is $\omega$-restricted if all
conditional literals in it are $\omega$-restricted, and for all $V \in \mathrm{Var}(R)$,
$\Omega(V, R, S) < \Omega(R, S)$. A program $P$ is $\omega$-restricted if there exists a strict
$\omega$-stratification $S$ such that all rules in $P$ are $\omega$-restricted under $S$.

Example 8. The rule {hc(X, Y)} ← arc(X, Y) is $\omega$-restricted under the stratification
defined in Example 7 since $\Omega(R, S) = \omega > \Omega(X, R, S) = \Omega(Y, R, S) = 0$.

Example 9. The rule a(f(X)) ← a(X) is not $\omega$-restricted since for each stratification
$S$, $\Omega(R, S) = \Omega(X, R, S)$.
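Examples 8 and 9 can be replayed with a small Python sketch of the variable condition of Definition 12. It covers only the comparison of valuations, not the condition on conditional literals. It assumes that the minimum in Equation (21) ranges over the simple literals in which the variable actually occurs, and it represents a simple literal as a pair of predicate name and argument tuple; both are assumptions of the sketch.

```python
import math

OMEGA = math.inf  # stands in for the omega-stratum


def omega_var(var, simple_literals, S):
    """Valuation of a variable: the lowest stratum among the simple literals
    that mention it, or omega when no simple literal mentions it."""
    strata = [S[pred] for pred, args in simple_literals if var in args]
    return min(strata + [OMEGA])


def variables_restricted(head_pred, variables, simple_literals, S):
    """Every variable must have a valuation strictly below the rule's."""
    return all(omega_var(v, simple_literals, S) < S[head_pred]
               for v in variables)


# Example 8: {hc(X, Y)} <- arc(X, Y) under the stratification of Example 7.
S7 = {"vtx": 0, "arc": 0, "start": 0, "r": OMEGA, "hc": OMEGA}
print(variables_restricted("hc", ["X", "Y"], [("arc", ("X", "Y"))], S7))

# Example 9: a(f(X)) <- a(X); the head and the only simple literal share
# the predicate a, so the strict inequality fails for any stratum of a.
for stratum in (0, 1, 5, OMEGA):
    print(variables_restricted("a", ["X"], [("a", ("X",))], {"a": stratum}))
```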

If a predicate belongs to a finite stratum in one strict stratification, then it does 
so in all such stratifications [11]. We call those predicates domain predicates. 







Theorem 1. The existence of a stable model of a finite $\omega$-restricted cardinality
constraint program is decidable.

Proof. (Sketch, details are shown in [11].) We can construct a strict $\omega$-stratification
of a program $P$ by finding the strongly connected components of its dependency
graph [11]. After that, we can show by induction that at each stratum,
starting from the first one, the stable model induced by the rules on that stratum
is finite. At the first stratum all rules are ground, so the models are trivially finite.
In the following strata each variable that occurs in a rule occurs in a simple
literal that belongs to an earlier stratum, so each rule has only a finite number
of instances with satisfiable bodies. As the dependency graph contains a finite
number of nodes, the number of different non-empty strata is also finite.

7 Adding Interpreted Function Symbols 

A practical ASP system has to have direct support for interpreted function
symbols such as arithmetic operators. Otherwise, encoding problems involving
arithmetic becomes cumbersome very quickly. In this work we take the approach
that we add an interpretation function $\mathcal{I}$ that canonizes Herbrand terms. For
example, with arithmetic addition we want to have $\mathcal{I}(+(5, 2)) = 7$ where +(5, 2)
is a Herbrand term formed from function symbols +, 5, and 2.

Definition 13. Let $P$ be a cardinality constraint program. Then, an evaluator
is a function $\mathcal{I} : HU(P) \rightarrow HU(P)$.

We also use $\mathcal{I}(F)$ to denote the formula that is obtained from $F$ by evaluating all
terms in it. For example, $\mathcal{I}(a(t_1, \ldots, t_n)) = a(\mathcal{I}(t_1), \ldots, \mathcal{I}(t_n))$. Next, we alter
the definition of the instantiation of a rule so that all terms in it get evaluated.

Definition 14. The instantiation of a rule $R = A \leftarrow C_1, \ldots, C_n$ is the set of
rules:

\[ HI(R, P) = \{ \mathcal{I}(A\sigma) \leftarrow \mathcal{I}(I(C_1, P)\sigma), \ldots, \mathcal{I}(I(C_n, P)\sigma) \mid \sigma \in \mathrm{Sub}(\mathrm{Var}(P), HU(P)) \} \]

In practice we can often save computational effort by evaluating a function
as soon as we know its arguments.

We do not impose any requirements on the evaluation function $\mathcal{I}$ in this work.
The most practical way to construct it is to have the logic programming tool
provide a number of built-in functions that implement the common arithmetic
operations, and then suppose that any other function symbol has the Herbrand
interpretation $\mathcal{I}(x) = x$ for all $x \in HU$. In the case that the user tries to use the
built-ins in an undefined way, such as writing the term 2 + f(a), we have three
possible ways to handle it: 1) we may revert to the Herbrand interpretation;
2) we may add an explicit error term e and return that as an answer; or 3) we may add
some high-level type checking to the tool and reject the whole program as erroneous
whenever undefined operations occur.
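A minimal evaluator along these lines can be sketched in Python. Terms are represented as nested tuples `(functor, arg1, ..., argn)`, with integers and strings as constants; the table of built-ins and the choice of option 1 (falling back to the Herbrand interpretation on undefined uses) are assumptions of this sketch, not a description of any particular tool.

```python
import operator

# Hypothetical table of binary arithmetic built-ins.
BUILTINS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(term):
    """Canonize a Herbrand term: evaluate arguments first, apply a built-in
    when it is defined on them, and otherwise keep the term as it is."""
    if not isinstance(term, tuple):
        return term  # a constant evaluates to itself
    functor, *args = term
    args = [evaluate(a) for a in args]
    defined = (functor in BUILTINS and len(args) == 2
               and all(isinstance(a, int) for a in args))
    if defined:
        return BUILTINS[functor](*args)
    return (functor, *args)  # Herbrand interpretation


print(evaluate(("+", 5, 2)))
print(evaluate(("f", ("+", 1, 2))))
print(evaluate(("+", 2, ("f", "a"))))
```

The last call corresponds to the undefined term 2 + f(a): the sketch leaves it unevaluated. Returning an error term or rejecting the program would replace the final `return` with the corresponding behaviour.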







8 Conclusions 

We defined the class of cardinality constraint programs that allows the use of
cardinality constraints and conditional literals. The language is a superset of
normal logic programs. These constructs make it possible to express universal and
existential quantifications in rule bodies. We also identified a decidable subset
of it, $\omega$-restricted programs, and showed how to formalize built-in functions.

A direct line of further research is to identify other types of aggregate literals 
that behave in a similar way to cardinality constraints, that is, aggregates that 
can be used as first-class literals in rules. 

References 

1. Christian Anger, Kathrin Konczak, and Thomas Linke. NoMoRe: A system for
non-monotonic reasoning under answer set semantics. In Proceedings of LPNMR'01,
pages 406-410, September 2001.

2. A. Chandra and D. Harel. Horn clause queries and generalizations. Journal of
Logic Programming, 1:1-15, 1985.

3. Evgeny Dantsin, Thomas Eiter, Georg Gottlob, and Andrei Voronkov. Complexity
and expressive power of logic programming. ACM Comput. Surv., 33(3):374-425,
2001.

4. Tina Dell'Armi, Wolfgang Faber, Giuseppe Ielpa, Christoph Koch, Nicola Leone,
Simona Perri, and Gerald Pfeifer. System description: DLV. In Proceedings of
LPNMR'01, Vienna, Austria, September 2001. Springer-Verlag.

5. D. East and M. Truszczyński. Propositional satisfiability in answer-set
programming. In Proceedings of KI 2001, pages 138-153, 2001.

6. M. Gelfond and V. Lifschitz. The stable model semantics for logic programming.
In Proceedings of ICLP'88, pages 1070-1080. The MIT Press, August 1988.

7. David B. Kemp and Peter J. Stuckey. Semantics of logic programs with aggregates.
In Proceedings of ILP'91, pages 387-401. MIT Press, 1991.

8. Ilkka Niemelä and Patrik Simons. Extending the Smodels system with cardinality
and weight constraints. In Jack Minker, editor, Logic-Based Artificial Intelligence,
pages 491-521. Kluwer Academic Publishers, 2000.

9. Patrik Simons, Ilkka Niemelä, and Timo Soininen. Extending and implementing
the stable model semantics. Artificial Intelligence, 138(1-2):181-234, 2002.

10. Tommi Syrjänen. Omega-restricted logic programs. In Proceedings of LPNMR'01,
Vienna, Austria, September 2001. Springer-Verlag.

11. Tommi Syrjänen. Logic programming with cardinality constraints. Research
Report A 86, Helsinki University of Technology, Laboratory for Theoretical Computer
Science, Helsinki, Finland, December 2003.

12. Tommi Syrjänen and Ilkka Niemelä. The Smodels system. In Proceedings of
LPNMR'01, Vienna, Austria, September 2001. Springer-Verlag.




Recursive Aggregates in Disjunctive Logic Programs: 
Semantics and Complexity* 



Wolfgang Faber¹, Nicola Leone², and Gerald Pfeifer¹

¹ Institut für Informationssysteme, TU Wien, A-1040 Wien, Austria
faber@kr.tuwien.ac.at, gerald@pfeifer.com
² Department of Mathematics, University of Calabria, I-87030 Rende (CS), Italy
leone@unical.it



Abstract. The addition of aggregates has been one of the most relevant enhance- 
ments to the language of answer set programming (ASP). They strengthen the 
modeling power of ASP, in terms of concise problem representations. While many 
important problems can be encoded using nonrecursive aggregates, some relevant 
examples lend themselves for the use of recursive aggregates. Previous semantic 
definitions typically agree in the nonrecursive case, but the picture is less clear 
for recursion. Some proposals explicitly avoid recursive aggregates, most others 
differ, and many of them do not satisfy desirable criteria, such as minimality or 
coincidence with answer sets in the aggregate-free case. 

In this paper we define a semantics for disjunctive programs with arbitrary ag- 
gregates (including monotone, antimonotone, and nonmonotone aggregates). This 
semantics is a fully declarative, genuine generalization of the answer set semantics 
for disjunctive logic programming (DLP). It is defined by a natural variant of the 
Gelfond-Lifschitz transformation, and treats aggregate and non-aggregate literals 
in a uniform way. We prove that our semantics guarantees the minimality (and 
therefore the incomparability) of answer sets, and demonstrate that it coincides 
with the standard answer set semantics on aggregate-free programs. Finally we 
analyze the computational complexity of this language, paying particular attention 
to the impact of syntactical restrictions on programs. 



1 Introduction 

Aggregates significantly enhance the language of answer set programming (ASP), allow- 
ing for natural and concise modeling of many problems. Nonrecursive (also called strat- 
ified) aggregates have clear semantics and capture a large class of meaningful problem 
specifications. However, there are relevant problems for which recursive (unstratified) 
aggregate formulations are natural; the Company Control problem, illustrated next, is a 
typical example, cf. [1,2, 3, 4]. 

Example 1. We are given a set of facts for predicate company(X), denoting the
companies involved, and a set of facts for predicate ownsStk(C1, C2, Perc), denoting the
percentage of shares of company C2 that is owned by company C1. Then, company

* This work was supported by the European Commission under projects IST-2002-33570
INFOMIX, IST-2001-37004 WASP, and IST-2001-33570 COLOGNET.



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 200-212, 2004. 
© Springer-Verlag Berlin Heidelberg 2004







C1 controls company C2 if the sum of the shares of C2 owned either directly by C1 or
by companies that are controlled by C1 is more than 50%. This problem has been
encoded as the following program $\mathcal{P}_{ctrl}$ by many authors in the literature [1,2,3,4].¹

    controlsStk(C1, C1, C2, P) :- ownsStk(C1, C2, P).
    controlsStk(C1, C2, C3, P) :- company(C1), controls(C1, C2), ownsStk(C2, C3, P).
    controls(C1, C3) :- company(C1), company(C3),
                        #sum{P, C2 : controlsStk(C1, C2, C3, P)} > 50.

Intuitively, controlsStk(C1, C2, C3, P) denotes that company C1 controls P% of C3
shares "through" company C2 (as C1 controls C2, and C2 owns P% of C3 shares).
Predicate controls(C1, C2) encodes that company C1 controls company C2. For two
companies, say, c1 and c3, controls(c1, c3) is derived if the sum of the elements in the
multiset {P | ∃C2 : controlsStk(c1, C2, c3, P)} is greater than 50. Note that in the
DLV syntax this multiset is expressed by {P, C2 : controlsStk(c1, C2, c3, P)}, where
the variable C2 prevents duplicate occurrences of P from being eliminated.
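The intended meaning of the Company Control encoding can be sketched as a fixpoint computation in Python. This is an illustration under assumptions: it computes the least set of control relations closed under the three rules, which is the reading commonly intended for this program, rather than implementing the answer set semantics defined in this paper. The function name and the sample ownership data are hypothetical.

```python
def company_control(companies, owns_stk):
    """owns_stk is a set of facts (owner, owned, percent).
    Returns the least set of pairs (c1, c3) such that c1 controls c3."""
    controls = set()
    while True:
        derived = set()
        for c1 in companies:
            for c3 in companies:
                # Pairs (P, C2) with controlsStk(c1, C2, c3, P): C2 is c1
                # itself or a company already known to be controlled by c1.
                # Keeping C2 in the pair retains equal percentages that
                # come from different companies.
                shares = {(p, c2) for (c2, owned, p) in owns_stk
                          if owned == c3
                          and (c2 == c1 or (c1, c2) in controls)}
                if sum(p for p, _ in shares) > 50:
                    derived.add((c1, c3))
        if derived == controls:
            return controls
        controls = derived


companies = {"a", "b", "c"}
owns = {("a", "b", 60), ("a", "c", 30), ("b", "c", 30)}
print(sorted(company_control(companies, owns)))
```

In the sample data a controls b directly, and then controls c through its own 30% plus the 30% held by b. Both contributions have the same value P = 30, which is why the pair (P, C2) rather than P alone has to be collected.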

The encoding of Company Control contains a recursive aggregate (since predicate 
controlsStk in the aggregate depends on the head predicate controls ). Unfortunately, 
however, recursive aggregates are not easy to handle, and their semantics is not always 
straightforward. 

Example 2. Consider the following two programs:

    $\mathcal{P}_1$: {p(a) :- #count{X : p(X)} > 0.}    $\mathcal{P}_2$: {p(a) :- #count{X : p(X)} < 1.}

In both cases p(a) is the only atom for p which might be true, so, intuitively, one
may expect that #count{X : p(X)} > 0 is true iff p(a) is true, while
#count{X : p(X)} < 1 should be true iff p(a) is false. Thus, the above programs should,
respectively, behave like the following standard programs:

    $\mathcal{P}'_1$: {p(a) :- p(a).}    $\mathcal{P}'_2$: {p(a) :- not p(a).}

This is not always the case in the literature, and there is a debate on the best semantics
for recursive aggregates.

There have been several attempts to define a suitable semantics for aggregates
[2,6,7,4,8]. However, while previous semantic definitions typically agree in the
nonrecursive case, the picture is not so clear for recursion. Some proposals explicitly
avoid recursive aggregates, most others differ, and many of them do not satisfy desirable
criteria, such as minimality.² Relevant progress towards a suitable semantics for recursive
aggregates has been recently made in [4,8], where the authors provide a semantics which
guarantees minimality and extends standard answer sets. However, both definitions are
given operationally and do not cover all language fragments. The first proposal disregards
disjunctive programs, while the latter covers only monotone aggregates.

¹ Throughout this paper, we adopt the concrete syntax of the DLV language [5] to express
aggregates in the examples.

² The subset-minimality of answer sets, which holds in the aggregate-free case and for the main
nonmonotonic logics [9], also guarantees that answer sets are incomparable, and allows one to
define the transitive closure - which becomes impossible if minimality is lost [4].







In this paper, we make a step forward and provide a fully declarative semantics which 
works also for disjunctive programs and arbitrary aggregates. The main contributions of 
the paper are the following: 

- We provide a definition of the answer sets semantics for disjunctive programs with 
arbitrary aggregates (including monotone aggregates, antimonotone aggregates, and 
aggregates which are neither monotone nor antimonotone). This semantics is fully 
declarative and is given in the standard way for answer sets, by a generalization of 
the well-known Gelfond-Lifschitz transformation. 

- We study the properties of the proposed semantics, and show the following results: 

• Our answer sets are subset-minimal models, and therefore they are incomparable 
to each other, which is generally seen as an important property of nonmonotonic 
semantics [10,4]. 

• For aggregate-free programs, our semantics coincides with the standard answer 
set semantics. 

• From a semantic viewpoint, monotone aggregate literals correspond to positive 
standard literals, while antimonotone aggregates correspond to negative stan- 
dard literals. We provide a rewriting from standard logic programs with negation 
to positive programs with antimonotone aggregate atoms. 

- We carry out an in-depth analysis of the computational complexity of disjunctive 
programs with aggregates and fragments thereof. As long as the values of aggregates 
are computable in polynomial time, their addition does not increase the complex- 
ity of the full DLP language. However, the complexity of some fragments of DLP 
is affected by aggregates. Interestingly, monotone aggregates never alter the com- 
plexity, while antimonotone aggregates cause a complexity gap in many cases (see 
Section 4); arbitrary aggregates behave precisely like antimonotone aggregates from 
the complexity viewpoint in the studied cases. 



2 The DLP^A Language

In this section, we provide a formal definition of the syntax and semantics of the DLP^A
language - an extension of Disjunctive Logic Programming (DLP) by set-oriented
functions (also called aggregate functions). We assume that the reader is familiar with standard
DLP; we refer to atoms, literals, rules, and programs of DLP as standard atoms, standard
literals, standard rules, and standard programs, respectively. For further background,
see [11,12].

2.1 Syntax

Set Terms. A (DLP^A) set term is either a symbolic set or a ground set. A symbolic set
is a pair {Vars : Conj}, where Vars is a list of variables and Conj is a conjunction of
standard literals.³ A ground set is a set of pairs of the form ⟨t : Conj⟩, where t is a list
of constants and Conj is a ground (variable free) conjunction of standard literals.

³ Intuitively, a symbolic set {X : a(X, Y), not p(Y)} stands for the set of X-values making
a(X, Y), not p(Y) true, i.e., {X | ∃Y s.t. a(X, Y), not p(Y) is true}.







Aggregate Functions. An aggregate function is of the form $f(S)$, where $S$ is a set
term, and $f$ is an aggregate function symbol. Intuitively, an aggregate function can be
thought of as a (possibly partial) function mapping multisets⁴ of constants to a constant.

Example 3. The aggregate functions currently supported by the DLV system are: #min
(minimal term, undefined for empty set), #max (maximal term, undefined for empty
set), #count (number of terms), #sum (sum of non-negative integers), and #times
(product of positive integers).

Aggregate Literals. An aggregate atom is $f(S) \prec T$, where $f(S)$ is an aggregate
function, $\prec \in \{=, <, \le, >, \ge\}$ is a predefined comparison operator, and $T$ is a term
(variable or constant) referred to as guard.

Example 4. The following are aggregate atoms in DLV notation, where the latter contains
a ground set and could be a ground instance of the former:

    #max{Z : r(Z), not a(Z, V)} > Y
    #max{⟨2 : r(2), not a(2, x)⟩, ⟨2 : r(2), not a(2, y)⟩} > 1

An atom is either a standard (DLP) atom or an aggregate atom. A literal L is an atom A
or an atom A preceded by the default negation symbol not; if A is an aggregate atom,
L is an aggregate literal.

DLP^A Programs. A (DLP^A) rule r is a construct

\[ a_1 \vee \cdots \vee a_n \;\text{:-}\; b_1, \ldots, b_k, \text{not } b_{k+1}, \ldots, \text{not } b_m. \]

where $a_1, \ldots, a_n$ are standard atoms, $b_1, \ldots, b_m$ are atoms, and $n \ge 0$, $m \ge k \ge 0$,
$n + m > 0$. The disjunction $a_1 \vee \cdots \vee a_n$ is referred to as the head of r, while the
conjunction $b_1, \ldots, b_k, \text{not } b_{k+1}, \ldots, \text{not } b_m$ is the body of r. A (DLP^A) program is a
set of DLP^A rules.



Syntactic Properties. A global variable of a rule r is a variable appearing in a standard
atom of r; all other variables are local variables.

Safety. A rule r is safe if the following conditions hold: (i) each global variable of
r appears in a positive standard literal in the body of r; (ii) each local variable of r
appearing in a symbolic set {Vars : Conj} appears in a positive literal in Conj; (iii)
each guard of an aggregate atom of r is a constant or a global variable. A program $\mathcal{P}$ is
safe if all $r \in \mathcal{P}$ are safe. In the following we assume that DLP^A programs are safe.

Example 5. Consider the following rules with DLV aggregates:

    p(X) :- q(X, Y, V), #max{Z : r(Z), not a(Z, V)} > V.
    p(X) :- q(X, Y, V), #sum{Z : not a(Z, S)} > Y.
    p(X) :- q(X, Y, V), #min{Z : r(Z), not a(Z, V)} > T.

The first rule is safe, while the second is not, since both local variables Z and S violate
condition (ii). The third rule is not safe either, since the guard T violates condition (iii).

⁴ Note that aggregate functions are evaluated on the valuation of a (ground) set w.r.t. an
interpretation, which is a multiset, cf. Section 2.2.







Stratification. A DLP^A program $\mathcal{P}$ is aggregate-stratified if there exists a function
$\|\ \|$, called level mapping, from the set of (standard) predicates of $\mathcal{P}$ to ordinals, such
that for each pair a and b of standard predicates, occurring in the head and body of a rule
$r \in \mathcal{P}$, respectively: (i) if b appears in an aggregate atom, then $\|b\| < \|a\|$, and (ii) if b
occurs in a standard atom, then $\|b\| \le \|a\|$.

Example 6. Consider the program consisting of a set of facts for predicates a and b, plus
the following two rules:

    q(X) :- p(X), #count{Y : a(Y, X), b(X)} < 2.    p(X) :- q(X), b(X).

The program is aggregate-stratified, as the level mapping $\|a\| = \|b\| = 1$, $\|p\| = \|q\| = 2$
satisfies the required conditions. If we add the rule b(X) :- p(X), then no such
level mapping exists and the program becomes aggregate-unstratified.
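The claim of Example 6 can be verified by exhaustive search. The Python sketch below abstracts each rule to the predicates in its head, in its standard body atoms, and inside its aggregate atoms; this representation and the function names are assumptions. For a finite program it suffices to try levels 0 to n-1 for n predicates, since any level mapping can be compressed to the ranks of the values it uses.

```python
from itertools import product


def is_level_mapping(level, rules):
    """Check conditions (i) and (ii) for every head/body predicate pair."""
    for heads, std_body, agg_body in rules:
        for a in heads:
            if any(level[b] > level[a] for b in std_body):
                return False  # violates ||b|| <= ||a||
            if any(level[b] >= level[a] for b in agg_body):
                return False  # violates ||b|| < ||a||
    return True


def aggregate_stratified(rules):
    """Search all level mappings into {0, ..., n-1}."""
    preds = sorted({p for h, s, g in rules for p in (*h, *s, *g)})
    for levels in product(range(len(preds)), repeat=len(preds)):
        if is_level_mapping(dict(zip(preds, levels)), rules):
            return True
    return False


# (head predicates, predicates in standard body atoms, predicates in aggregates)
EXAMPLE_6 = [
    (["q"], ["p"], ["a", "b"]),   # q(X) :- p(X), #count{Y : a(Y,X), b(X)} < 2.
    (["p"], ["q", "b"], []),      # p(X) :- q(X), b(X).
]
EXTENDED = EXAMPLE_6 + [(["b"], ["p"], [])]   # b(X) :- p(X).

print(aggregate_stratified(EXAMPLE_6))
print(aggregate_stratified(EXTENDED))
```

With the added rule the constraints require that b lies at or above p, that p and q lie on the same level, and that b lies strictly below q, which cannot all hold, so the search finds no mapping.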

Intuitively, aggregate-stratification forbids recursion through aggregates. While the
semantics of aggregate-stratified programs is more or less agreed upon, different and
disagreeing semantics for aggregate-unstratified programs have been defined in the past,
cf. [4]. In the following we shall provide a novel characterization which directly extends
well-known formulations of semantics for aggregate-free programs.

2.2 Semantics 

Universe and Base. Given a DLP^A program $\mathcal{P}$, let $U_{\mathcal{P}}$ denote the set of constants
appearing in $\mathcal{P}$, and $B_{\mathcal{P}}$ the set of standard atoms constructible from the (standard)
predicates of $\mathcal{P}$ with constants in $U_{\mathcal{P}}$. Given a set $X$, let $\overline{2}^X$ denote the set of all multisets
over elements from $X$. Without loss of generality, we assume that aggregate functions
map to $\mathbb{I}$ (the set of integers).

Example 7. Let us now describe the domains of the aggregate functions in DLV (where
$\mathbb{N}$ and $\mathbb{N}^+$ denote the set of non-negative integers and positive integers, respectively):
#count is defined over $\overline{2}^{U_{\mathcal{P}}}$, #sum over $\overline{2}^{\mathbb{N}}$, #times over $\overline{2}^{\mathbb{N}^+}$,⁵ and #min and #max are
defined over $\overline{2}^{\mathbb{N}} - \{\emptyset\}$.

Instantiation. A substitution is a mapping from a set of variables to $U_{\mathcal{P}}$. A
substitution from the set of global variables of a rule r (to $U_{\mathcal{P}}$) is a global substitution for
r; a substitution from the set of local variables of a symbolic set S (to $U_{\mathcal{P}}$) is a local
substitution for S. Given a symbolic set without global variables S = {Vars : Conj},
the instantiation of S is the following ground set of pairs inst(S):

    {⟨γ(Vars) : γ(Conj)⟩ | γ is a local substitution for S}.⁶

A ground instance of a rule r is obtained in two steps: (1) a global substitution σ for r
is first applied over r; (2) every symbolic set S in σ(r) is replaced by its instantiation
inst(S). The instantiation $Ground(\mathcal{P})$ of a program $\mathcal{P}$ is the set of all possible instances
of the rules of $\mathcal{P}$.

⁵ #sum and #times applied over an empty set return 0 and 1, respectively.

⁶ Given a substitution σ and a DLP^A object Obj (rule, set, etc.), we denote by σ(Obj) the object
obtained by replacing each variable X in Obj by σ(X).







Example 8. Consider the following program $\mathcal{P}_1$:

    q(1) ∨ p(2, 2).    q(2) ∨ p(2, 1).    t(X) :- q(X), #sum{Y : p(X, Y)} > 1.

The instantiation $Ground(\mathcal{P}_1)$ is the following:

    q(1) ∨ p(2, 2).    t(1) :- q(1), #sum{⟨1 : p(1, 1)⟩, ⟨2 : p(1, 2)⟩} > 1.
    q(2) ∨ p(2, 1).    t(2) :- q(2), #sum{⟨1 : p(2, 1)⟩, ⟨2 : p(2, 2)⟩} > 1.
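The instantiation of a symbolic set can be sketched in a few lines of Python. The sketch assumes that the global substitution has already been applied, that the conjunction contains only positive atoms written as tuples `(predicate, arg1, ...)`, and that variables are the strings listed in `variables`; these representation choices and the function name are not part of the paper.

```python
from itertools import product


def inst(variables, conj, universe):
    """Ground set for {variables : conj}: one pair (values, ground conjunction)
    for every local substitution of the variables by constants."""
    ground = []
    for values in product(universe, repeat=len(variables)):
        gamma = dict(zip(variables, values))
        # Replace variables by their values and keep constants unchanged.
        ground_conj = [(pred, *[gamma.get(t, t) for t in args])
                       for pred, *args in conj]
        ground.append((values, ground_conj))
    return ground


# The symbolic set {Y : p(X, Y)} of Example 8 after the substitution X = 1.
print(inst(["Y"], [("p", 1, "Y")], [1, 2]))
```

The two resulting pairs correspond to the ground set ⟨1 : p(1, 1)⟩, ⟨2 : p(1, 2)⟩ in the ground rule for t(1).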

Interpretation. An interpretation for a DLP$^{\mathcal{A}}$ program $\mathcal{P}$ is a set of standard ground
atoms $I \subseteq B_{\mathcal{P}}$. The truth valuation $I(A)$, where $A$ is a standard ground literal or a
standard ground conjunction, is defined in the usual way. An interpretation also provides
a meaning to (ground) sets, aggregate functions and aggregate literals, namely a
multiset, a value, and a truth value, respectively. Let $f(S)$ be an aggregate function.
The valuation $I(S)$ of $S$ w.r.t. $I$ is the multiset of the first constant of the elements
in $S$ whose conjunction is true w.r.t. $I$. More precisely, let $I(S)$ denote the multiset
$[t_1 \mid \langle t_1, \ldots, t_n : Conj \rangle \in S \wedge Conj \text{ is true w.r.t. } I]$. The valuation $I(f(S))$ of an aggregate
function $f(S)$ w.r.t. $I$ is the result of the application of $f$ on $I(S)$. If the multiset $I(S)$
is not in the domain of $f$, $I(f(S)) = \bot$ (where $\bot$ is a fixed symbol not occurring in $\mathcal{P}$).

An instantiated aggregate atom $A = f(S) \prec k$ is true w.r.t. $I$ if: (i) $I(f(S)) \neq \bot$,
and, (ii) $I(f(S)) \prec k$ holds; otherwise, $A$ is false. An instantiated aggregate literal
not $A$ = not $f(S) \prec k$ is true w.r.t. $I$ if (i) $I(f(S)) \neq \bot$, and, (ii) $I(f(S)) \prec k$ does
not hold; otherwise, not $A$ is false. A rule $r$ is satisfied w.r.t. $I$ if some head atom is true w.r.t.
$I$ whenever all body literals are true w.r.t. $I$.

Example 9. Consider the atom $A$ = #sum{⟨1 : p(2,1)⟩, ⟨2 : p(2,2)⟩} > 1 from Example 8.
Let $S$ be the ground set in $A$. For the interpretation $I = \{q(2), p(2,2), t(2)\}$,
$I(S) = [2]$, the application of #sum over $[2]$ yields 2, and $A$ is therefore true w.r.t. $I$,
since $2 > 1$. $I$ is a model of the program of Example 8.
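
The valuation steps above can be traced mechanically. The following is a minimal Python sketch, not the DLV implementation: the names `valuation`, `AGGREGATES`, `COMPARE`, and `aggregate_literal_true` are hypothetical, a ground set is assumed to be a list of `(terms, conjunction)` pairs, and each conjunction is assumed to contain positive atoms only.

```python
from math import prod

BOTTOM = None  # stands in for the undefined value (the "bottom" symbol of the text)

def valuation(ground_set, interp):
    """I(S): multiset (as a list) of the first constants of elements whose conjunction holds."""
    return [terms[0] for terms, conj in ground_set
            if all(atom in interp for atom in conj)]

# Aggregate functions over multisets; #min and #max are undefined on the empty multiset,
# while #sum and #times return 0 and 1 on it (footnote 5).
AGGREGATES = {
    "#count": len,
    "#sum": sum,
    "#times": prod,
    "#max": lambda ms: max(ms) if ms else BOTTOM,
    "#min": lambda ms: min(ms) if ms else BOTTOM,
}

COMPARE = {
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "=": lambda a, b: a == b,
}

def aggregate_literal_true(fname, ground_set, op, k, interp, negated=False):
    """Truth of f(S) op k (or of its negation) w.r.t. interp."""
    value = AGGREGATES[fname](valuation(ground_set, interp))
    if value is BOTTOM:
        return False  # undefined aggregate: the literal is false, negated or not
    holds = COMPARE[op](value, k)
    return (not holds) if negated else holds

# Example 9: the ground set of the last rule of Ground(P_1) and the interpretation I.
S = [((1,), ("p(2,1)",)), ((2,), ("p(2,2)",))]
I = {"q(2)", "p(2,2)", "t(2)"}
print(valuation(S, I))                               # [2]
print(aggregate_literal_true("#sum", S, ">", 1, I))  # True
```

Returning `False` for an undefined value in both the positive and the negated case mirrors conditions (i) of the two truth definitions.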

Definition 1. A ground literal $\ell$ is monotone, if for all interpretations $I, J$, such that
$I \subseteq J$, $\ell$ is true w.r.t. $I$ implies that $\ell$ is true w.r.t. $J$. A ground literal $\ell$ is antimonotone,
if for all interpretations $I, J$, such that $I \subseteq J$, $\ell$ is true w.r.t. $J$ implies that $\ell$ is true w.r.t.
$I$. A ground literal $\ell$ is nonmonotone, if it is neither monotone nor antimonotone.

Note that positive standard literals are monotone, whereas negative standard literals
are antimonotone. Aggregate literals may be monotone, antimonotone or nonmonotone,
regardless of whether they are positive or negative.

Example 10. All ground instances of the following aggregate literals are monotone:

#count{Z : r(Z)} > 1        not #count{Z : r(Z)} < 1

while the following are antimonotone:

#count{Z : r(Z)} < 1        not #count{Z : r(Z)} > 1

Nonmonotone literals include the sum over (possibly negative) integers and the average.
Also, most monotone or antimonotone functions combined with the equality operator
yield nonmonotone literals.
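
For a small set of ground atoms, Definition 1 can be checked by brute force. The sketch below is only an illustration under stated assumptions: the helper names `subsets`, `classify`, and `count_r` are hypothetical, and the ground instance is taken over the two atoms r(1) and r(2). It enumerates every pair of interpretations $I \subseteq J$ and classifies the literals of Example 10, plus one equality literal.

```python
from itertools import combinations

def subsets(atoms):
    """All subsets of a finite set of atoms, as frozensets."""
    items = sorted(atoms)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def classify(literal, atoms):
    """Classify a ground literal (a function interpretation -> bool) per Definition 1."""
    monotone = antimonotone = True
    for j in subsets(atoms):
        for i in subsets(j):  # every i enumerated here satisfies i <= j
            if literal(i) and not literal(j):
                monotone = False
            if literal(j) and not literal(i):
                antimonotone = False
    if monotone and antimonotone:
        return "monotone and antimonotone"
    if monotone:
        return "monotone"
    if antimonotone:
        return "antimonotone"
    return "nonmonotone"

ATOMS = {"r(1)", "r(2)"}

def count_r(interp):
    """#count{<1 : r(1)>, <2 : r(2)>} w.r.t. interp."""
    return sum(1 for atom in ATOMS if atom in interp)

print(classify(lambda i: count_r(i) > 1, ATOMS))        # monotone
print(classify(lambda i: not count_r(i) < 1, ATOMS))    # monotone
print(classify(lambda i: count_r(i) < 1, ATOMS))        # antimonotone
print(classify(lambda i: not count_r(i) > 1, ATOMS))    # antimonotone
print(classify(lambda i: count_r(i) == 1, ATOMS))       # nonmonotone
```

The enumeration is exponential in the number of atoms, so it serves only as a sanity check on small ground instances.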







W. Faber, N. Leone, and G. Pfeifer 



2.3 Answer Sets 

We will next define the notion of answer sets for DLP$^{\mathcal{A}}$ programs. While usually this is
done by first defining the notion of answer sets for positive programs (coinciding with
the minimal model semantics) and then for negative programs by a stability condition
on a reduct, once aggregates have to be considered, the notions of positive and negative
literals are in general not clear. If only monotone and antimonotone aggregate atoms
were considered, one could simply treat monotone literals like positive literals and
antimonotone literals like negative ones, and follow the standard approach, as hinted at
in [4]. Since we also consider nonmonotone aggregates, such a categorization is not
feasible, and we rely on a definition which always employs a stability condition on a
reduct.

The subsequent definitions are directly based on models: An interpretation $M$ is a
model of a DLP$^{\mathcal{A}}$ program $\mathcal{P}$ if all $r \in Ground(\mathcal{P})$ are satisfied w.r.t. $M$. An interpretation
$M$ is a subset-minimal model of $\mathcal{P}$ if no $I \subset M$ is a model of $Ground(\mathcal{P})$.

Next we provide the transformation by which the reduct of a ground program w.r.t. 
an interpretation is formed. Note that this definition is a generalization of the Gelfond- 
Lifschitz transformation for DLP programs (see Theorem 3). 

Definition 2. Given a ground DLP$^{\mathcal{A}}$ program $\mathcal{P}$ and an interpretation $I$, let $\mathcal{P}^{I}$ denote
the transformed program obtained from $\mathcal{P}$ by deleting rules in which a body literal is
false w.r.t. $I$.

Example 11. Consider Example 2: $Ground(P_1)$ = {p(a) :- #count{⟨a : p(a)⟩} > 0.}
and $Ground(P_2)$ = {p(a) :- #count{⟨a : p(a)⟩} < 1.}, and interpretations
$I_1 = \{p(a)\}$, $I_2 = \emptyset$. Then, $Ground(P_1)^{I_1} = Ground(P_1)$, $Ground(P_1)^{I_2} = \emptyset$,
and $Ground(P_2)^{I_1} = \emptyset$, $Ground(P_2)^{I_2} = Ground(P_2)$ hold.

We are now ready to formulate the stability criterion for answer sets. 

Definition 3 (Answer Sets for DLP$^{\mathcal{A}}$ Programs). Given a DLP$^{\mathcal{A}}$ program $\mathcal{P}$, an
interpretation $A$ of $Ground(\mathcal{P})$ is an answer set if it is a subset-minimal model of
$Ground(\mathcal{P})^{A}$.

Note that any answer set $A$ of $\mathcal{P}$ is also a model of $\mathcal{P}$ because $Ground(\mathcal{P})^{A} \subseteq Ground(\mathcal{P})$,
and rules in $Ground(\mathcal{P}) - Ground(\mathcal{P})^{A}$ are satisfied w.r.t. $A$.

Example 12. For the programs of Example 2, $I_2$ of Example 11 is the only answer
set of $P_1$ (because $I_1$ is not a minimal model of $Ground(P_1)^{I_1}$), while $P_2$ admits no
answer set ($I_1$ is not a minimal model of $Ground(P_2)^{I_1}$, and $I_2$ is not a model of
$Ground(P_2) = Ground(P_2)^{I_2}$).

For Example 1 and the following input facts

company(a). company(b). company(c).

ownsStk(a,b,40). ownsStk(c,b,20). ownsStk(a,c,40). ownsStk(b,c,20).

only the set $A$ = {controlsStk(a,a,b,40), controlsStk(a,a,c,40), controlsStk(b,b,c,20),
controlsStk(c,c,b,20)} (omitting facts) is an answer set, which means that no company
controls another company. Note that $A_1 = A \cup$ {controls(a,b), controls(a,c),
controlsStk(a,b,c,20), controlsStk(a,c,b,20)} is not an answer set, which is reasonable,
since there is no basis for the truth of literals in $A_1 - A$.
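
The reduct of Definition 2 and the stability check of Definition 3 can be replayed on the two single-rule ground programs of Example 11. This is a brute-force sketch under stated assumptions, not an efficient procedure: a rule is assumed to be a pair `(head atoms, list of body literals)`, each body literal is a Python function from interpretations to truth values, and all helper names are hypothetical.

```python
from itertools import combinations

def subsets(atoms):
    """All subsets of a finite set of atoms, as frozensets."""
    items = sorted(atoms)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def body_true(rule, interp):
    return all(literal(interp) for literal in rule[1])

def is_model(program, interp):
    """Every rule with a true body has some true head atom."""
    return all(not body_true(r, interp) or bool(r[0] & interp) for r in program)

def reduct(program, interp):
    """Definition 2: delete every rule having a body literal false w.r.t. interp."""
    return [r for r in program if body_true(r, interp)]

def is_answer_set(program, interp):
    """Definition 3: interp is a subset-minimal model of the reduct w.r.t. interp."""
    red = reduct(program, interp)
    if not is_model(red, interp):
        return False
    return not any(is_model(red, j) for j in subsets(interp) if j < interp)

def answer_sets(program, atoms):
    return [i for i in subsets(atoms) if is_answer_set(program, i)]

def count_pa(interp):
    """#count{<a : p(a)>} w.r.t. interp."""
    return 1 if "p(a)" in interp else 0

# Ground(P_1) = { p(a) :- #count{<a : p(a)>} > 0. }
P1 = [(frozenset({"p(a)"}), [lambda i: count_pa(i) > 0])]
# Ground(P_2) = { p(a) :- #count{<a : p(a)>} < 1. }
P2 = [(frozenset({"p(a)"}), [lambda i: count_pa(i) < 1])]

print(answer_sets(P1, {"p(a)"}))  # [frozenset()]
print(answer_sets(P2, {"p(a)"}))  # []
```

The output agrees with Example 12: the empty interpretation is the only answer set of $P_1$, and $P_2$ has none.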







This definition is a generalization and simplification of the definitions given in [13,10].
In particular, in contrast to [10], we define answer sets directly on top of the notion
of models of DLP$^{\mathcal{A}}$ programs, rather than transforming them to a positive program.

3 Semantic Properties 

A generally desirable and important property of nonmonotonic semantics is minimality
[10,4]; in particular, a semantics should refine the notion of minimal models. We now
show that our semantics has this property.

Theorem 1. Answer sets of a DLP$^{\mathcal{A}}$ program $\mathcal{P}$ are subset-minimal models of $\mathcal{P}$.

Proof. Our proof is by contradiction: Assume that $I_1$ is a model of $\mathcal{P}$, $I_2$ is an answer
set of $\mathcal{P}$ and that $I_1 \subset I_2$.$^{7}$ Since $I_2$ is an answer set of $\mathcal{P}$, it is a subset-minimal
model of $Ground(\mathcal{P})^{I_2}$ by Definition 3. Therefore, $I_1$ is not a model of $Ground(\mathcal{P})^{I_2}$
(otherwise, $I_2$ would not be a subset-minimal model of $Ground(\mathcal{P})^{I_2}$). Thus, some
rule $r \in Ground(\mathcal{P})^{I_2}$ is not satisfied w.r.t. $I_1$. Since $Ground(\mathcal{P})^{I_2} \subseteq Ground(\mathcal{P})$,
$r$ is also in $Ground(\mathcal{P})$ and therefore $I_1$ cannot be a model of $\mathcal{P}$, contradicting the
assumption.

Corollary 1. Answer sets of a DLP$^{\mathcal{A}}$ program $\mathcal{P}$ are incomparable (w.r.t. set inclusion)
among each other.

Theorem 1 can be refined for DLP$^{\mathcal{A}}$ programs containing only monotone literals.

Theorem 2. The answer sets of a DLP$^{\mathcal{A}}$ program $\mathcal{P}$, where $\mathcal{P}$ contains only monotone
literals, are precisely the minimal models of $\mathcal{P}$.

Proof. Let $\mathcal{P}$ be a DLP$^{\mathcal{A}}$ program containing only monotone literals, and $I$ be a minimal
model of $\mathcal{P}$. Clearly, $I$ is also a model of $\mathcal{P}^{I}$. We again proceed by contradiction and
show that no $J \subset I$ is a model of $\mathcal{P}^{I}$: Assume that such a model $J$ of $\mathcal{P}^{I}$ exists and
satisfies all rules in $Ground(\mathcal{P})^{I}$. All rules in $Ground(\mathcal{P}) - Ground(\mathcal{P})^{I}$ are satisfied
by $I$ because their body is false w.r.t. $I$. But since $\mathcal{P}$ contains only monotone literals,
each literal that is false w.r.t. $I$ is also false w.r.t. $J \subset I$, and hence $J$ also satisfies all rules in
$Ground(\mathcal{P}) - Ground(\mathcal{P})^{I}$ and would therefore be a model of $\mathcal{P}$, contradicting the
assumption that $I$ is a minimal model. Together with Theorem 1, the result follows.

Clearly, a very desirable feature of a semantics for an extended language is that 
it properly extends agreed-upon semantics of the base language, so that the semantics 
are equal on the base language. Therefore we next show that for DLP programs, our 
semantics coincides with the standard answer set semantics. Note that not all semantics 
which have been proposed for programs with aggregates meet this requirement, cf. [4]. 

Theorem 3. Given a DLP program $\mathcal{P}$, an interpretation $I$ is an answer set of $\mathcal{P}$ according
to Definition 3 iff it is an answer set of $\mathcal{P}$ according to the standard definition via
the classic Gelfond-Lifschitz transformation [11].

7 Throughout the paper, $\subset$ denotes strict set inclusion.







Proof. ($\Rightarrow$): Assume that $I$ is an answer set w.r.t. Definition 3, i.e. $I$ is a minimal model
of $Ground(\mathcal{P})^{I}$. Let us denote the standard Gelfond-Lifschitz transformed program by
$GL(Ground(\mathcal{P}), I)$. For each $r \in Ground(\mathcal{P})^{I}$ some $r' \in GL(Ground(\mathcal{P}), I)$ exists,
which is obtained from $r$ by removing all negative literals. Since $r \in Ground(\mathcal{P})^{I}$,
all negative literals of $r$ are true in $I$, and also in all $J \subseteq I$. For rules of which an
$r'' \in GL(Ground(\mathcal{P}), I)$ exists but no corresponding rule in $Ground(\mathcal{P})^{I}$, some positive
body literal of $r''$ is false w.r.t. $I$ (hence $r''$ is not included in $Ground(\mathcal{P})^{I}$), and
also false w.r.t. all $J \subseteq I$. Therefore (i) $I$ is a model of $GL(Ground(\mathcal{P}), I)$ and (ii) no
$J \subset I$ is a model of $GL(Ground(\mathcal{P}), I)$, as it would also be a model of $Ground(\mathcal{P})^{I}$
and $I$ thus would not be a minimal model of $Ground(\mathcal{P})^{I}$. Hence $I$ is a minimal model
of $GL(Ground(\mathcal{P}), I)$ whenever it is a minimal model of $Ground(\mathcal{P})^{I}$.

($\Leftarrow$): Now assume that $I$ is a standard answer set of $\mathcal{P}$, that is, $I$ is a minimal model
of $GL(Ground(\mathcal{P}), I)$. By similar reasoning as in ($\Rightarrow$) a rule $r \in GL(Ground(\mathcal{P}), I)$
with true body w.r.t. $I$ has a corresponding rule $r' \in Ground(\mathcal{P})^{I}$ which contains the
negative body of the original rule $r^{o} \in Ground(\mathcal{P})$, which is true w.r.t. all $J \subseteq I$. Any
rule $r'' \in GL(Ground(\mathcal{P}), I)$ with false body w.r.t. $I$ is not contained in $Ground(\mathcal{P})^{I}$,
but it is satisfied in each $J \subseteq I$. Therefore (i) $I$ is a model of $Ground(\mathcal{P})^{I}$ and
(ii) no $J \subset I$ is a model of $Ground(\mathcal{P})^{I}$ (otherwise $J$ would also be a model of
$GL(Ground(\mathcal{P}), I)$). As a consequence, $I$ is a minimal model of $Ground(\mathcal{P})^{I}$ whenever
it is a minimal model of $GL(Ground(\mathcal{P}), I)$.
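
The coincidence stated in Theorem 3 can be observed on a small aggregate-free program. The sketch below is a brute-force illustration only, with hypothetical helper names and an assumed rule encoding `(head, positive body, negated atoms)`. The sample program a ∨ b. c :- a, not b. d :- not c. is chosen here for illustration and does not come from the paper.

```python
from itertools import combinations

def subsets(atoms):
    """All subsets of a finite set of atoms, as frozensets."""
    items = sorted(atoms)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def rule(head, pos=(), neg=()):
    return (frozenset(head), frozenset(pos), frozenset(neg))

def body_true(r, interp):
    _, pos, neg = r
    return pos <= interp and not (neg & interp)

def is_model(program, interp):
    return all(not body_true(r, interp) or bool(r[0] & interp) for r in program)

def gl_reduct(program, interp):
    """Gelfond-Lifschitz: drop rules with a false negative literal, strip the rest of negation."""
    return [(h, pos, frozenset()) for h, pos, neg in program if not (neg & interp)]

def def2_reduct(program, interp):
    """Definition 2: drop rules with any false body literal, keep the others unchanged."""
    return [r for r in program if body_true(r, interp)]

def answer_sets(program, atoms, reduct):
    """Interpretations that are subset-minimal models of their own reduct."""
    found = set()
    for cand in subsets(atoms):
        red = reduct(program, cand)
        minimal = not any(is_model(red, j) for j in subsets(cand) if j < cand)
        if is_model(red, cand) and minimal:
            found.add(cand)
    return found

PROGRAM = [rule({"a", "b"}), rule({"c"}, pos={"a"}, neg={"b"}), rule({"d"}, neg={"c"})]
ATOMS = {"a", "b", "c", "d"}

standard = answer_sets(PROGRAM, ATOMS, gl_reduct)
via_definition_3 = answer_sets(PROGRAM, ATOMS, def2_reduct)
print(standard == via_definition_3)  # True
```

Both reducts yield the answer sets {a, c} and {b, d} for this program, although the reduced programs themselves differ.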

4 Computational Complexity 

4.1 Complexity Framework 

We analyze the complexity of DLP$^{\mathcal{A}}$ on Cautious Reasoning, a main reasoning task
in nonmonotonic formalisms, amounting to the following decisional problem: Given a
DLP$^{\mathcal{A}}$ program $\mathcal{P}$ and a standard ground atom $A$, is $A$ true in all answer sets of $\mathcal{P}$?

We consider propositional (i.e., variable-free) DLP$^{\mathcal{A}}$ programs, and polynomial-time
computable aggregate functions (note that all sample aggregate functions appearing in
this paper fall into this class).

4.2 Overview of Complexity Results 

Table 1 summarizes the complexity results derived in the next sections. The rows specify
the allowance of negation (not); the columns specify the allowance of aggregates,
namely: $M_s$ = stratified monotone aggregates, $M$ = full (possibly recursive) monotone
aggregates, $A_s$ = stratified antimonotone aggregates, $A$ = full antimonotone aggregates,
$N_s$ = stratified nonmonotone aggregates, and $N$ = full nonmonotone aggregates.

The good news is that the addition of aggregates does not increase the complexity
of disjunctive logic programming. Cautious reasoning on the full DLP$^{\mathcal{A}}$ language, including
all considered types of aggregates (monotone, antimonotone, and nonmonotone)
even unstratified, remains $\Pi^P_2$-complete, as for standard DLP.

The most “benign” aggregates, from the complexity viewpoint, are the monotone
ones, whose addition never causes any complexity increase, even for negation-free
programs, and even for unstratified monotone aggregates.







Table 1. The Complexity of Cautious Reasoning on Disjunctive Programs with Aggregates

                 ∅        {M_s}    {M}      {A_s}    {N_s}    {M_s,A_s,N_s}    {N}      {A}      {M,A,N}
negation-free    co-NP    co-NP    co-NP    Π^P_2    Π^P_2    Π^P_2            Π^P_2    Π^P_2    Π^P_2
with negation    Π^P_2    Π^P_2    Π^P_2    Π^P_2    Π^P_2    Π^P_2            Π^P_2    Π^P_2    Π^P_2



On negation-free programs, the addition of either antimonotone or nonmonotone
aggregates increases the complexity, jumping from co-NP to $\Pi^P_2$. In all other cases, the
complexity remains the same as for standard programs.

4.3 Proofs of Hardness Results 

An important observation is that negation can be rewritten to an antimonotone aggregate. 
It is therefore possible to turn aggregate-free programs with negation into corresponding 
positive programs with aggregates. 

Definition 4. Given an (aggregate-free) DLP program $\mathcal{P}$, let $\Gamma(\mathcal{P})$ be the DLP$^{\mathcal{A}}$ program
which is obtained by replacing each negative literal not a in $\mathcal{P}$ by #count{⟨e : a⟩} < 1,
where e is an arbitrary constant.

Theorem 4. Each aggregate-free DLP program $\mathcal{P}$ can be transformed into an equivalent
positive DLP$^{\mathcal{A}}$ program $\Gamma(\mathcal{P})$ with aggregate literals (all of which are antimonotone).
If $\mathcal{P}$ is stratified w.r.t. negation, then $\Gamma(\mathcal{P})$ is aggregate-stratified (i.e., all aggregates in
$\Gamma(\mathcal{P})$ are nonrecursive).

Proof. Note that for any interpretation $I$, not a is true w.r.t. $I$ iff #count{⟨e : a⟩} < 1
is true w.r.t. $I$, and that #count{⟨e : a⟩} < 1 is an antimonotone aggregate literal. By
virtue of Theorem 3, our answer sets semantics (as in Definition 3) is equivalent to the
standard answer set semantics. Thus, since the valuation of literals is equal in $\mathcal{P}$ and
$\Gamma(\mathcal{P})$, both programs have the same answer sets.

Since aggregates take the place of negative literals, if the latter are nonrecursive in
$\mathcal{P}$ (i.e., $\mathcal{P}$ is stratified w.r.t. negation), the former are nonrecursive as well (i.e., $\Gamma(\mathcal{P})$ is
aggregate-stratified).
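
The rewriting of Definition 4 is simple enough to sketch directly. The code below is an illustration under stated assumptions, not the authors' implementation: rules are assumed to be triples `(head, positive body, negated atoms)`, body literals of the rewritten program are Python functions over interpretations, and the names `gamma`, `count_lt_1`, and `holds` are hypothetical. It rewrites the program a :- not b. b :- not a. (chosen here for illustration) and computes the answer sets of the result with the reduct of Definition 2.

```python
from itertools import combinations

def subsets(atoms):
    """All subsets of a finite set of atoms, as frozensets."""
    items = sorted(atoms)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def count_lt_1(atom):
    """The aggregate literal  #count{<e : atom>} < 1  as a test on an interpretation."""
    return lambda interp: (1 if atom in interp else 0) < 1

def holds(atom):
    """A positive standard literal as a test on an interpretation."""
    return lambda interp: atom in interp

def gamma(program):
    """Definition 4: replace every 'not a' by the aggregate literal built by count_lt_1."""
    return [(frozenset(head),
             [holds(a) for a in sorted(pos)] + [count_lt_1(a) for a in sorted(neg)])
            for head, pos, neg in program]

def body_true(rule, interp):
    return all(literal(interp) for literal in rule[1])

def is_model(program, interp):
    return all(not body_true(r, interp) or bool(r[0] & interp) for r in program)

def answer_sets(program, atoms):
    """Definition 3 by brute force, using the reduct of Definition 2."""
    found = set()
    for cand in subsets(atoms):
        red = [r for r in program if body_true(r, cand)]
        minimal = not any(is_model(red, j) for j in subsets(cand) if j < cand)
        if is_model(red, cand) and minimal:
            found.add(cand)
    return found

# a :- not b.   b :- not a.
PROGRAM = [({"a"}, set(), {"b"}), ({"b"}, set(), {"a"})]
print(answer_sets(gamma(PROGRAM), {"a", "b"}) == {frozenset({"a"}), frozenset({"b"})})  # True
```

The rewritten program is negation-free, yet it keeps the two answer sets {a} and {b} of the original program, which is the observation the hardness proofs rely on.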

Theorem 5. Let $\mathcal{P}$ be a DLP program. Then (i) $\Gamma(\mathcal{P})$ has the same size (i.e., number
of rules and literals) as $\mathcal{P}$, and (ii) $\Gamma(\mathcal{P})$ is LOGSPACE computable from $\mathcal{P}$.

Proof. The $\Gamma(\mathcal{P})$ transformation replaces each negative literal by an aggregate atom;
and it does not add any further literal to the program. Therefore it does not increase
the program size. It is easy to see that $\Gamma(\mathcal{P})$ can be computed by a LOGSPACE Turing
Machine. Indeed, $\Gamma(\mathcal{P})$ can be generated by dealing with one rule of $\mathcal{P}$ at a time, without
storing any intermediate data apart from a fixed number of indices.

Finally, we state the relation between antimonotone and nonmonotone literals.

Theorem 6. Each DLP$^{\mathcal{A}}$ program, whose aggregates are all antimonotone, can be
transformed into an equivalent program, whose aggregates are all nonmonotone.







Proof. W.l.o.g. we will consider a ground program $\mathcal{P}$. We transform each antimonotone
aggregate literal $\ell$ containing the aggregate atom $f(S) \prec k$ to $\ell'$ containing $f'(S') \prec k$.
We introduce three fresh constants $\tau$, $\epsilon$, and $\nu$ and a new predicate symbol $\Pi$. Let $f'$
be undefined for the multisets $[\tau]$ and $[\tau, \epsilon, \nu]$ and return a value making $\ell$ true for $[\tau, \epsilon]$
(such a value does always exist); otherwise $f'$ is equal to $f$. Furthermore, $S'$ is obtained
by adding $\langle \tau : \Pi(\tau) \rangle$, $\langle \epsilon : \Pi(\epsilon) \rangle$, and $\langle \nu : \Pi(\nu) \rangle$ to the ground set $S$. The transformed
program $\mathcal{P}'$ contains only nonmonotone aggregates and is equivalent to $\mathcal{P}$.

Theorem 7. Each field of Table 1 states the proper hardness of the corresponding fragment
of DLP$^{\mathcal{A}}$.

Proof. The hardness results for all fields in the second row of Table 1 stem from the
$\Pi^P_2$-hardness of disjunctive programs with negation [14].$^{8}$ The same result, together
with Theorems 4 and 5, entails $\Pi^P_2$-hardness of all the DLP$^{\mathcal{A}}$ fragments admitting
antimonotone aggregates. $\Pi^P_2$-hardness of all the DLP$^{\mathcal{A}}$ fragments with nonmonotone
aggregates then follows from Theorem 6. Finally, the results in the first three entries in
the first row stem from the co-NP-hardness of positive disjunctive programs [14].

4.4 Proofs of Membership Results 

In the membership proofs, we will implicitly use the following lemma: 

Lemma 1. Given an interpretation $I$ for a DLP$^{\mathcal{A}}$ program $\mathcal{P}$, the truth valuation of an
aggregate atom $L$ is computable in polynomial time.

Proof. Let $L = f(T) \prec k$. To determine the truth valuation of $L$, we have to: (i) compute
the valuation $I(T)$ of the ground set $T$ w.r.t. $I$, (ii) apply the aggregate function $f$ on
$I(T)$, and (iii) compare the result of $f(I(T))$ with $k$ w.r.t. $\prec$.

Computing the valuation of a ground set $T$ only requires scanning each element
$\langle t_1, \ldots, t_n : Conj \rangle$ of $T$, adding $t_1$ to the result multiset if $Conj$ is true w.r.t. $I$. This
is evidently polynomial, as is the application of the aggregate function on $I(T)$ in our
framework (see Section 4.1). The comparison with $k$, finally, is straightforward.

Lemma 2. Let $\mathcal{P}$ be a negation-free DLP$^{\mathcal{A}}$ program, whose aggregates are all monotone.
A standard ground atom $A$ is not a cautious consequence of $\mathcal{P}$, if and only if there
exists a model $M$ of $\mathcal{P}$ which does not contain $A$.$^{9}$

Proof. Observe first that, since $\mathcal{P}$ does not contain negation and only monotone aggregate
literals, each literal appearing in $\mathcal{P}$ is monotone.

($\Leftarrow$): The existence of a model $M$ of $\mathcal{P}$ not containing $A$, implies the existence of
a minimal model $M'$ of $\mathcal{P}$ (with $M' \subseteq M$) not containing $A$. By virtue of Theorem 2,
$M'$ is an answer set of $\mathcal{P}$. Therefore, $A$ is not a cautious consequence of $\mathcal{P}$.

($\Rightarrow$): Since $A$ is not a cautious consequence of $\mathcal{P}$, by definition of cautious reasoning,
there exists an answer set $M$ of $\mathcal{P}$ which does not contain $A$. By definition of answer
sets, $M$ is also a model of $\mathcal{P}$, as remarked after Definition 3.

8 Recall that even for stratified negation cautious reasoning on disjunctive programs is $\Pi^P_2$-hard.

9 Note that $M$ can be any model, possibly non-minimal, of $\mathcal{P}$.







Theorem 8. Cautious reasoning over negation-free disjunctive programs, whose aggregates
are all monotone, is in co-NP.

Proof. By Lemma 2 we can check whether a ground atom $A$ is not a cautious consequence
of a program $\mathcal{P}$ as follows: (i) Guess an interpretation $M$ of $\mathcal{P}$, (ii) check that
$M$ is a model and $A \notin M$. The check is clearly polynomial-time computable, and the
problem is therefore in co-NP.

Lemma 3. Checking whether an interpretation $M$ is an answer set of an arbitrary
DLP$^{\mathcal{A}}$ program $\mathcal{P}$ is in co-NP.

Proof. To prove that $M$ is not an answer set of $\mathcal{P}$, we guess an interpretation $M'$ of $\mathcal{P}$,
and check that (at least) one of the following conditions holds: (i) $M'$ is a model of $\mathcal{P}^{M}$,
and $M' \subset M$, or (ii) $M$ is not a model of $\mathcal{P}^{M}$. The checking of both conditions above
is clearly in polynomial time, and the problem is therefore in co-NP.

Theorem 9. Cautious reasoning over arbitrary DLP$^{\mathcal{A}}$ programs is in $\Pi^P_2$.

Proof. We verify that a ground atom $A$ is not a cautious consequence of a DLP$^{\mathcal{A}}$ program
$\mathcal{P}$ as follows: Guess an interpretation $M \subseteq B_{\mathcal{P}}$ and check that (1) $M$ is an answer set
for $\mathcal{P}$, and (2) $A$ is not true w.r.t. $M$. Task (2) is clearly polynomial, while (1) is in co-NP
by virtue of Lemma 3. The problem therefore lies in $\Pi^P_2$.
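
The two guess-and-check arguments can be contrasted in a deterministic sketch, where the "guess" is replaced by exhaustive enumeration (so the running time is exponential and says nothing about the complexity bounds themselves). All function names are hypothetical, rules are assumed to be pairs `(head atoms, list of body literals)`, and the sample program a ∨ b. c :- #count{⟨1 : a⟩, ⟨2 : b⟩} > 0. is an illustration chosen here, not taken from the paper.

```python
from itertools import combinations

def subsets(atoms):
    """All subsets of a finite set of atoms, as frozensets."""
    items = sorted(atoms)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def body_true(rule, interp):
    return all(literal(interp) for literal in rule[1])

def is_model(program, interp):
    return all(not body_true(r, interp) or bool(r[0] & interp) for r in program)

def is_answer_set(program, interp):
    """Lemma 3's check: interp is a model of its reduct and no proper subset is."""
    red = [r for r in program if body_true(r, interp)]
    if not is_model(red, interp):
        return False
    return not any(is_model(red, j) for j in subsets(interp) if j < interp)

def cautious_by_answer_sets(program, atoms, query):
    """Theorem 9: query is a cautious consequence iff no answer set omits it."""
    return not any(query not in m
                   for m in subsets(atoms) if is_answer_set(program, m))

def cautious_by_models(program, atoms, query):
    """Lemma 2 shortcut: valid only for negation-free programs with monotone aggregates."""
    return not any(query not in m
                   for m in subsets(atoms) if is_model(program, m))

# a v b.    c :- #count{<1 : a>, <2 : b>} > 0.
PROGRAM = [
    (frozenset({"a", "b"}), []),
    (frozenset({"c"}), [lambda i: ("a" in i) + ("b" in i) > 0]),
]
ATOMS = {"a", "b", "c"}

for atom in sorted(ATOMS):
    print(atom, cautious_by_answer_sets(PROGRAM, ATOMS, atom),
          cautious_by_models(PROGRAM, ATOMS, atom))
```

On this negation-free program with a monotone aggregate, the model-based test needs no minimality check and still agrees with the answer-set-based test: only c is a cautious consequence.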

5 Related Work and Conclusions 

There have been considerable efforts to define semantics for logic programs with recursive
aggregates, but most works do not consider disjunctive programs or do not cover
all kinds of aggregates. In [4] a partial stable semantics for non-disjunctive programs
with aggregates has been defined, for which the “standard” total stable semantics is a
special case, while in [8] a stable semantics for disjunctive programs with aggregates has been
given; but only monotone aggregates are considered. These semantics guarantee the
same benign properties as ours, namely minimality and coincidence with answer sets in
the aggregate-free case. On the respective language fragment, [4] intuitively coincides
with our semantics (but a formal demonstration is still to be done). For [8] there is a
slight difference when an aggregate function in a negative literal is undefined. E.g., the
program {cheap :- not #max{X : salary(X)} > 1000.} without facts for salary would
yield the answer set {cheap} w.r.t. [8], while our semantics admits only $\emptyset$.

A thorough discussion of pros and cons for the various approaches for recursive 
aggregates has been given in [4,15], so we will only briefly compare our approach with 
previous ones on typical examples. 

The approaches of [2,6,7] basically all admit non-minimal answer sets. In particular,
program $P_1$ of Example 2 would have $\emptyset$ and {p(a)} as answer sets. As shown in
Example 12 (also by Theorem 1), the semantics proposed in this paper only admits $\emptyset$.

The approach of [13] is defined on non-disjunctive programs with particular kinds
of aggregates (called cardinality and weight constraints), which basically correspond to
programs with count and sum functions. As shown in [4], the program {a :- #sum{⟨1 :
not a⟩} < 0.} admits two stable models, $\emptyset$ and {a}, according to [13], whereas our
semantics only allows for $\emptyset$ as an answer set. An extension to this approach has been
presented in [10], which allows for arbitrary aggregates in non-disjunctive programs.

Finally, the work in [16] deals with the more abstract concept of generalized quan- 
tifiers, and the semantics therein shares several properties with ours. 

Concluding, we proposed a declarative semantics for disjunctive programs with 
arbitrary aggregates. We demonstrated that our semantics is endowed with desirable 
properties. Importantly, we proved that aggregate literals do not increase the computa- 
tional complexity of disjunctive programs in our approach. Future work concerns the 
design of efficient algorithms for the implementation of our proposal in the DLV system. 
Upon completion of this paper, we have learned that yet another semantics has been in- 
dependently proposed in [15]; studying the relationship to it is also a subject for future 
work. 

We would like to thank the anonymous reviewers for their useful comments. 

References 

1. Mumick, I.S., Pirahesh, H., Ramakrishnan, R.: The magic of duplicates and aggregates. In: 
VLDB’90 (1990) 264-277 

2. Kemp, D.B., Stuckey, P.J.: Semantics of Logic Programs with Aggregates. In: ISLP’91, MIT 
Press (1991) 387-401 

3. Ross, K.A., Sagiv, Y.: Monotonic Aggregation in Deductive Databases. JCSS 54 (1997) 
79-97 

4. Pelov, N., Denecker, M., Bruynooghe, M.: Partial stable models for logic programs with
aggregates. In: LPNMR-7. LNCS 2923, Springer (2004) 207-219

5. Dell'Armi, T., Faber, W., Ielpa, G., Leone, N., Pfeifer, G.: Aggregate Functions in Disjunctive
Logic Programming: Semantics, Complexity, and Implementation in DLV. In: IJCAI 2003,
Acapulco, Mexico, Morgan Kaufmann (2003) 847-852

6. Gelfond, M.: Representing Knowledge in A-Prolog. In: Computational Logic. Logic Programming
and Beyond. LNCS 2408, Springer (2002) 413-451

7. Dell'Armi, T., Faber, W., Ielpa, G., Leone, N., Pfeifer, G.: Aggregate Functions in DLV. In:
ASP'03, Messina, Italy (2003) 274-288. Online at http://CEUR-WS.org/Vol-78/

8. Pelov, N., Truszczyński, M.: Semantics of disjunctive programs with monotone aggregates -
an operator-based approach. In: NMR 2004. (2004) 327-334

9. Marek, V.W., Remmel, J.B.: On Logic Programs with Cardinality Constraints. In: NMR'2002
(2002) 219-228

10. Marek, V.W., Remmel, J.B.: Set Constraints in Logic Programming. In: LPNMR-7. LNCS,
Springer (2004) 167-179

11. Gelfond, M., Lifschitz, V.: Classical Negation in Logic Programs and Disjunctive Databases.
New Generation Computing 9 (1991) 365-385

12. Eiter, T., Faber, W., Leone, N., Pfeifer, G.: Declarative Problem-Solving Using the DLV
System. In Minker, J., ed.: Logic-Based Artificial Intelligence. Kluwer (2000) 79-103

13. Niemelä, I., Simons, P., Soininen, T.: Stable Model Semantics of Weight Constraint Rules.
In: LPNMR'99. Number 1730 in Lecture Notes in AI (LNAI), Springer (1999) 107-116

14. Dantsin, E., Eiter, T., Gottlob, G., Voronkov, A.: Complexity and Expressive Power of Logic 
Programming. ACM Computing Surveys 33 (2001) 374-425 

15. Pelov, N.: Semantics of Logic Programs with Aggregates. PhD thesis, Katholieke Universiteit 
Leuven, Leuven, Belgium (2004) 

16. Eiter, T., Gottlob, G., Veith, H.: Modular Logic Programming and Generalized Quantifiers. 
In: LPNMR'97. LNCS 1265, Springer (1997) 290-309 




A Logic for Reasoning About 
Coherent Conditional Probability: 
A Modal Fuzzy Logic Approach 



Enrico Marchioni 1,2 and Lluís Godo 1

1 Institut d'Investigació en Intel·ligència Artificial
Campus UAB, 08193 Bellaterra, Spain
{enrico,godo}@iiia.csic.es
2 Departamento de Lógica, Universidad de Salamanca
Campus Unamuno, 37007 Salamanca, Spain
marchioni@usal.es



Abstract. In this paper we define a logic to reason about coherent
conditional probability, in the sense of de Finetti. Under this view,
a conditional probability $\mu(\cdot \mid \cdot)$ is a primitive notion that applies over
conditional events of the form “$\varphi$ given $\psi$”, where $\psi$ is not the impossible
event. Our approach exploits an idea already used by Hájek and
colleagues to define a logic for (unconditional) probability in the frame
of fuzzy logics. Namely, in our logic for each pair of classical propositions
$\varphi$ and $\psi$, we take the probability of the conditional event “$\varphi$ given
$\psi$”, $\varphi|\psi$ for short, as the truth-value of the (fuzzy) modal proposition
$P(\varphi \mid \psi)$, read as “$\varphi|\psi$ is probable”. Based on this idea we define a fuzzy
modal logic FCP(ŁΠ), built up over the many-valued logic ŁΠ½ (a logic
which combines the well-known Łukasiewicz and Product fuzzy logics),
which is shown to be complete with respect to the class of probabilistic
Kripke structures induced by coherent conditional probabilities. Finally,
we show that checking coherence of a probability assessment to an arbitrary
family of conditional events is tantamount to checking consistency
of a suitably defined theory over the logic FCP(ŁΠ).



1 Introduction: Conditional Probability and Fuzzy Logic 

Reasoning under uncertainty is a key issue in many areas of Artificial Intelligence.
From a logical point of view, uncertainty basically concerns formulas
that can be either true or false, but their truth-value is unknown due to incompleteness
of the available information. Among the different models of uncertainty,
probability theory is no doubt the most relevant. One may find in the literature
a number of logics to reason about probability, some of them rather early. We
may cite [1,6,7,9,10,14,16,18,19,20,21,22,23,24] as some of the most relevant references.
Besides, it is worth mentioning the recent book [15] by Halpern, where
a deep investigation of uncertainty (not only probability) representations and
uncertainty logics is presented.



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 213-225, 2004. 
(c) Springer- Verlag Berlin Heidelberg 2004 







E. Marchioni and L. Godo



Almost all the probability logics in the above references are based on
classical two-valued logic (except for [10]). In this paper we develop a propositional
fuzzy logic of (conditional) probability for which completeness results are
provided. In [13] a new approach, further elaborated in [12] and in [11], was proposed
to axiomatize logics of uncertainty in the framework of fuzzy logic. The
basic idea consists in considering, for each classical (two-valued) proposition $\varphi$, a
(fuzzy) modal proposition $P\varphi$ which reads “$\varphi$ is probable” and taking as truth-degree
of $P\varphi$ the probability of $\varphi$. Then one can define theories about the $P\varphi$’s
over a particular fuzzy logic including, as axioms, formulas corresponding to
the basic postulates of probability theory. The advantage of such an approach
is that algebraic operations needed to compute with probabilities (or with any
other uncertainty model) are embedded in the connectives of the many-valued
logical framework, resulting in clear and elegant formalizations.

In reasoning with probability, a crucial issue concerns the notion of conditional
probability. Traditionally, given a probability measure $\mu$ on an algebra of
possible worlds $W$, if the agent observes that the actual world is in $A \subseteq W$,
then the updated probability measure $\mu(\cdot \mid A)$, called conditional probability, is
defined as $\mu(B \mid A) = \mu(B \cap A)/\mu(A)$, provided that $\mu(A) > 0$. If $\mu(A) = 0$ the
conditional probability remains then undefined. This yields both philosophical
and logical problems.

For instance, in [11] statements about conditional probability are handled by
introducing formulas $P(\varphi \mid \chi)$ standing for $P\chi \to_{\Pi} P(\varphi \wedge \chi)$. Such a definition
exploits the properties of Product logic implication $\to_{\Pi}$, whose truth function
behaves like a truncated division:

$$e(\Phi \to_{\Pi} \Psi) = \begin{cases} 1, & \text{if } e(\Phi) \leq e(\Psi) \\ e(\Psi)/e(\Phi), & \text{otherwise.} \end{cases}$$

With such a logical modelling, whenever the probability of the conditioning
event $\chi$ is 0, $P(\varphi \mid \chi)$ takes as truth-value 1. Therefore, this yields problems
when dealing with zero probabilities.
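
The zero-probability problem is easy to reproduce numerically. The following is a small sketch with hypothetical function names and made-up probability values, using exact rationals to avoid rounding: it evaluates the derived formula $P\chi \to_{\Pi} P(\varphi \wedge \chi)$ from the truth function of Product implication.

```python
from fractions import Fraction

def product_implication(x, y):
    """Truth function of Product implication: 1 if x <= y, otherwise y / x."""
    x, y = Fraction(x), Fraction(y)
    return Fraction(1) if x <= y else y / x

def derived_conditional(p_chi, p_phi_and_chi):
    """Truth value of P(phi | chi) when it abbreviates  P chi ->_Pi P(phi and chi)."""
    return product_implication(p_chi, p_phi_and_chi)

# Positive conditioning probability: the usual ratio is recovered.
print(derived_conditional(Fraction(1, 2), Fraction(1, 8)))  # 1/4

# Conditioning event of probability 0: both phi and its negation get truth value 1.
print(derived_conditional(0, 0))  # 1 for P(phi | chi)
print(derived_conditional(0, 0))  # 1 for P(not phi | chi) as well
```

When the conditioning event has probability 0, the probability of any conjunction with it is also 0, so the implication always evaluates to 1, for a proposition and for its negation alike.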

To overcome such difficulties, an alternative approach (that goes back to the
30’s with de Finetti, and later to the 60’s with Rényi and Popper among others)
proposes to consider conditional probability and conditional events as basic
notions, not derived from the notion of unconditional probability. Coletti and
Scozzafava’s book [4] includes a rich elaboration of different issues of reasoning
with coherent conditional probability, i.e. the conditional probability in the sense
of de Finetti. We take from there the following definition (cf. [4]).



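Definition 1. Let $\mathcal{G}$ be a Boolean algebra and let $\mathcal{B} \subseteq \mathcal{G}$ be closed with respect
to finite unions (additive set). Let $\mathcal{B}^{0} = \mathcal{B} \setminus \{\emptyset\}$. A conditional probability on
the set $\mathcal{G} \times \mathcal{B}^{0}$ of conditional events, denoted as $E|H$, is a function
$\mu : \mathcal{G} \times \mathcal{B}^{0} \to [0, 1]$ satisfying the following axioms:

(i) $\mu(H \mid H) = 1$, for all $H \in \mathcal{B}^{0}$

(ii) $\mu(\cdot \mid H)$ is a (finitely additive) probability on $\mathcal{G}$ for any given $H \in \mathcal{B}^{0}$

(iii) $\mu(E \cap A \mid H) = \mu(E \mid H) \cdot \mu(A \mid E \cap H)$, for all $A \in \mathcal{G}$ and $E, H, E \cap H \in \mathcal{B}^{0}$.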







In this paper we follow the above fuzzy logic approach to define a logic to
reason about conditional probability in the sense of Definition 1.$^{1}$ Thus, over
the fuzzy logic ŁΠ½ we directly introduce a modal operator $P$ as primitive, and
apply it to conditional events of the form $\varphi|\chi$. Unconditional probability, then,
arises as non-primitive whenever the conditioning event is a (classical) tautology.
The obvious reading of a statement like $P(\varphi \mid \chi)$ is “the conditional event ‘$\varphi$
given $\chi$’ is probable”. Similarly to the case mentioned above, the truth-value
of $P(\varphi \mid \chi)$ will be given by a conditional probability $\mu(\varphi \mid \chi)$. It is worth
mentioning a closely related approach by Flaminio and Montagna [8] which deals
with conditional probability in the frame of the fuzzy logic ŁΠ½, but differs
from ours in that they use non-standard probabilities.

The paper is structured as follows. After this introduction, in Section 2 we
overview the basic facts about the fuzzy logic ŁΠ½. In Section 3 we define our
conditional probability logic FCP(ŁΠ) as a modal fuzzy logic over ŁΠ½ and
prove soundness and completeness results with respect to the intended probabilistic
semantics. Then, in Section 4 we show how the problem of coherent
conditional probability assessments can be cast as a problem of determining the
logical consistency of a given theory in our logic. We end with some conclusions.



2 Logical Background: The ŁΠ½ Logic



The language of the LLI logic is built in the usual way from a countable set of 
propositional variables, three binary connectives — (Lukasiewicz implication), 
0 (Product conjunction) and — >n (Product implication), and the truth constant 
0. A truth-evaluation is a mapping e that assigns to every propositional variable 
a real number from the unit interval [0, 1] and extends to all formulas as follows: 



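\[
\begin{aligned}
e(\bar{0}) &= 0, \\
e(\varphi \to_L \psi) &= \min(1 - e(\varphi) + e(\psi),\ 1), \\
e(\varphi \odot \psi) &= e(\varphi) \cdot e(\psi), \\
e(\varphi \to_\Pi \psi) &=
  \begin{cases}
    1, & \text{if } e(\varphi) \le e(\psi) \\
    e(\psi)/e(\varphi), & \text{otherwise.}
  \end{cases}
\end{aligned}
\]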



The truth constant $\bar{1}$ is defined as $\varphi \to_L \varphi$. In this way we have $e(\bar{1}) = 1$ for any
truth-evaluation $e$. Moreover, many other connectives can be defined from those
introduced above:



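\[
\begin{array}{ll}
\neg_L \varphi \text{ is } \varphi \to_L \bar{0}, &
\neg_\Pi \varphi \text{ is } \varphi \to_\Pi \bar{0}, \\
\varphi \wedge \psi \text{ is } \varphi \,\&\, (\varphi \to_L \psi), &
\varphi \vee \psi \text{ is } \neg_L(\neg_L \varphi \wedge \neg_L \psi), \\
\varphi \oplus \psi \text{ is } \neg_L \varphi \to_L \psi, &
\varphi \,\&\, \psi \text{ is } \neg_L(\neg_L \varphi \oplus \neg_L \psi), \\
\varphi \ominus \psi \text{ is } \varphi \,\&\, \neg_L \psi, &
\varphi \equiv \psi \text{ is } (\varphi \to_L \psi) \,\&\, (\psi \to_L \varphi), \\
\Delta \varphi \text{ is } \neg_\Pi \neg_L \varphi, &
\nabla \varphi \text{ is } \neg_\Pi \neg_\Pi \varphi,
\end{array}
\]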



1 Notice that somewhat similar definitions of conditional probability can be found in
the literature. For instance, in [15] $\mathcal{B}^0$ is further required to be closed under supersets
and $\mathcal{G} \times \mathcal{B}^0$ is called a Popper algebra. See also [4] for a discussion concerning weaker
notions of conditional probability and their unpleasant consequences.







with the following interpretations: 



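\[
\begin{array}{ll}
e(\neg_L \varphi) = 1 - e(\varphi), &
e(\neg_\Pi \varphi) = \begin{cases} 1, & \text{if } e(\varphi) = 0 \\ 0, & \text{otherwise,} \end{cases} \\
e(\varphi \wedge \psi) = \min(e(\varphi), e(\psi)), &
e(\varphi \vee \psi) = \max(e(\varphi), e(\psi)), \\
e(\varphi \oplus \psi) = \min(1,\ e(\varphi) + e(\psi)), &
e(\varphi \,\&\, \psi) = \max(0,\ e(\varphi) + e(\psi) - 1), \\
e(\varphi \ominus \psi) = \max(0,\ e(\varphi) - e(\psi)), &
e(\varphi \equiv \psi) = 1 - |e(\varphi) - e(\psi)|, \\
e(\Delta \varphi) = \begin{cases} 1, & \text{if } e(\varphi) = 1 \\ 0, & \text{otherwise,} \end{cases} &
e(\nabla \varphi) = \begin{cases} 1, & \text{if } e(\varphi) > 0 \\ 0, & \text{otherwise.} \end{cases}
\end{array}
\]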
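The relation between the defined connectives and their interpretations can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function names (`imp_l`, `prod`, `imp_p`, and so on) are our own, and truth values are restricted to a small grid of exact rationals, so the check is an illustration rather than a proof.

```python
from fractions import Fraction as F

# Basic connectives, acting on truth values in [0, 1].
def imp_l(x, y):                 # Lukasiewicz implication
    return min(1 - x + y, 1)

def prod(x, y):                  # Product conjunction
    return x * y

def imp_p(x, y):                 # Product implication
    return F(1) if x <= y else y / x

# Derived connectives, built exactly as in the list of definitions.
def neg_l(x):     return imp_l(x, F(0))
def neg_p(x):     return imp_p(x, F(0))
def oplus(x, y):  return imp_l(neg_l(x), y)
def amp(x, y):    return neg_l(oplus(neg_l(x), neg_l(y)))
def land(x, y):   return amp(x, imp_l(x, y))
def lor(x, y):    return neg_l(land(neg_l(x), neg_l(y)))
def ominus(x, y): return amp(x, neg_l(y))
def equiv(x, y):  return amp(imp_l(x, y), imp_l(y, x))
def delta(x):     return neg_p(neg_l(x))
def nabla(x):     return neg_p(neg_p(x))

GRID = [F(i, 8) for i in range(9)]   # hypothetical sample of truth values

def closed_forms_agree():
    """Compare each derived connective with its stated interpretation."""
    for x in GRID:
        unary = [
            neg_l(x) == 1 - x,
            neg_p(x) == (1 if x == 0 else 0),
            delta(x) == (1 if x == 1 else 0),
            nabla(x) == (1 if x > 0 else 0),
        ]
        if not all(unary):
            return False
        for y in GRID:
            binary = [
                land(x, y) == min(x, y),
                lor(x, y) == max(x, y),
                oplus(x, y) == min(1, x + y),
                amp(x, y) == max(0, x + y - 1),
                ominus(x, y) == max(0, x - y),
                equiv(x, y) == 1 - abs(x - y),
            ]
            if not all(binary):
                return False
    return True
```

Exact rationals are used instead of floats so that equalities such as $e(\varphi \equiv \psi) = 1 - |e(\varphi) - e(\psi)|$ are compared without rounding error.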



The logic ŁΠ is defined Hilbert-style as the logical system whose axioms and
rules are the following$^{2}$:

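(i) Axioms of Łukasiewicz Logic:

(Ł1) $\varphi \to_L (\psi \to_L \varphi)$

(Ł2) $(\varphi \to_L \psi) \to_L ((\psi \to_L \chi) \to_L (\varphi \to_L \chi))$

(Ł3) $(\neg_L \varphi \to_L \neg_L \psi) \to_L (\psi \to_L \varphi)$

(Ł4) $((\varphi \to_L \psi) \to_L \psi) \to_L ((\psi \to_L \varphi) \to_L \varphi)$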

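(ii) Axioms of Product Logic$^{3}$:

(A1) $(\varphi \to_\Pi \psi) \to_\Pi ((\psi \to_\Pi \chi) \to_\Pi (\varphi \to_\Pi \chi))$

(A2) $(\varphi \odot \psi) \to_\Pi \varphi$

(A3) $(\varphi \odot \psi) \to_\Pi (\psi \odot \varphi)$

(A4) $(\varphi \odot (\varphi \to_\Pi \psi)) \to_\Pi (\psi \odot (\psi \to_\Pi \varphi))$

(A5a) $(\varphi \to_\Pi (\psi \to_\Pi \chi)) \to_\Pi ((\varphi \odot \psi) \to_\Pi \chi)$

(A5b) $((\varphi \odot \psi) \to_\Pi \chi) \to_\Pi (\varphi \to_\Pi (\psi \to_\Pi \chi))$

(A6) $((\varphi \to_\Pi \psi) \to_\Pi \chi) \to_\Pi (((\psi \to_\Pi \varphi) \to_\Pi \chi) \to_\Pi \chi)$

(Π1) $\neg_\Pi \neg_\Pi \chi \to_\Pi (((\varphi \odot \chi) \to_\Pi (\psi \odot \chi)) \to_\Pi (\varphi \to_\Pi \psi))$

(Π2) $\varphi \wedge \neg_\Pi \varphi \to_\Pi \bar{0}$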

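(iii) The following additional axioms relating Łukasiewicz and Product logic
connectives:

(¬) $\neg_\Pi \varphi \to_L \neg_L \varphi$

(Δ) $\Delta(\varphi \to_L \psi) \equiv \Delta(\varphi \to_\Pi \psi)$

(ŁΠ) $\varphi \odot (\psi \ominus \chi) \equiv (\varphi \odot \psi) \ominus (\varphi \odot \chi)$

(iv) Deduction rules of ŁΠ are modus ponens for $\to_L$ (modus ponens for $\to_\Pi$
is derivable), and necessitation for $\Delta$: from $\varphi$ derive $\Delta\varphi$.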

The logic ŁΠ½ is the logic obtained from ŁΠ by expanding the language
with a propositional variable $\overline{\tfrac{1}{2}}$ and adding the axiom:

(ŁΠ½) $\overline{\tfrac{1}{2}} \equiv \neg_L \overline{\tfrac{1}{2}}$

Obviously, a truth-evaluation $e$ for ŁΠ is easily extended to an evaluation for
ŁΠ½ by further requiring $e(\overline{\tfrac{1}{2}}) = \tfrac{1}{2}$.

2 This definition, proposed in [3], is actually a simplified version of the original
definition of ŁΠ given in [5].

3 Actually Product logic axioms also include axiom (A7) $\bar{0} \to_\Pi \varphi$, which is redundant
in ŁΠ.
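As a sanity check on the axioms relating the two families of connectives, one can evaluate them over sample truth values and confirm that they always receive value 1. The sketch below is our own illustration, with hypothetical function names and a hypothetical grid of rationals; a grid check of this kind is evidence of validity under the standard semantics, not a proof.

```python
from fractions import Fraction as F
from itertools import product

def imp_l(x, y):  return min(1 - x + y, 1)            # Lukasiewicz implication
def imp_p(x, y):  return F(1) if x <= y else y / x    # Product implication
def neg_l(x):     return 1 - x
def neg_p(x):     return F(1) if x == 0 else F(0)
def delta(x):     return F(1) if x == 1 else F(0)
def ominus(x, y): return max(F(0), x - y)
def equiv(x, y):  return 1 - abs(x - y)

def axiom_neg(p):                    # neg_Pi p ->_L neg_L p
    return imp_l(neg_p(p), neg_l(p))

def axiom_delta(p, q):               # Delta(p ->_L q) == Delta(p ->_Pi q)
    return equiv(delta(imp_l(p, q)), delta(imp_p(p, q)))

def axiom_lp(p, q, c):               # p (.) (q - c) == (p (.) q) - (p (.) c)
    return equiv(p * ominus(q, c), ominus(p * q, p * c))

def axiom_pi1(p, q, c):              # cancellation axiom (Pi1)
    inner = imp_p(imp_p(p * c, q * c), imp_p(p, q))
    return imp_p(neg_p(neg_p(c)), inner)

def axiom_half():                    # 1/2 == neg_L 1/2
    half = F(1, 2)
    return equiv(half, neg_l(half))

GRID = [F(i, 6) for i in range(7)]   # hypothetical sample of truth values

def all_valid():
    """True if every sampled instance of the axioms evaluates to 1."""
    if axiom_half() != 1:
        return False
    for p, q, c in product(GRID, repeat=3):
        values = (axiom_neg(p), axiom_delta(p, q),
                  axiom_lp(p, q, c), axiom_pi1(p, q, c))
        if any(v != 1 for v in values):
            return False
    return True
```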







From the above axiom systems, the notion of proof from a theory (a set of
formulas) in both logics, denoted $\vdash_{\text{ŁΠ}}$ and $\vdash_{\text{ŁΠ½}}$ respectively, is defined as
usual. Strong completeness of both logics for finite theories with respect to the
given semantics has been proved in [5]. In what follows we will restrict ourselves
to the logic ŁΠ½.

Theorem 1. For any finite set of formulas $T$ and any formula $\varphi$ of ŁΠ½, we
have $T \vdash_{\text{ŁΠ½}} \varphi$ iff $e(\varphi) = 1$ for any truth-evaluation $e$ which is a model$^{4}$ of $T$.

As is also shown in [5], for each rational $r \in [0, 1]$ a formula $\bar{r}$ is definable
in ŁΠ½ from the truth constant $\overline{\tfrac{1}{2}}$ and the connectives, so that $e(\bar{r}) = r$ for
each evaluation $e$. Therefore, in the language of ŁΠ½ we have a truth constant
for each rational in [0, 1], and due to completeness of ŁΠ½, the following
book-keeping axioms for rational truth constants are provable:

(RŁΠ1) $\neg_L \bar{r} \equiv \overline{1 - r}$

(RŁΠ2) $\bar{r} \to_L \bar{s} \equiv \overline{\min(1,\ 1 - r + s)}$

(RŁΠ3) $\bar{r} \odot \bar{s} \equiv \overline{r \cdot s}$

(RŁΠ4) $\bar{r} \to_\Pi \bar{s} \equiv \overline{r \Rightarrow_P s}$

where $r \Rightarrow_P s = 1$ if $r \le s$, and $r \Rightarrow_P s = s/r$ otherwise.
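To make the definability claim concrete, here is a small sketch of how a few rational constants arise from the single constant ½ and the connectives. It only illustrates particular values; it is not the general construction of [5], and the names below are our own.

```python
from fractions import Fraction as F

HALF = F(1, 2)                       # the only primitive constant besides 0

def neg_l(x):     return 1 - x                        # Lukasiewicz negation
def prod(x, y):   return x * y                        # Product conjunction
def imp_p(x, y):  return F(1) if x <= y else y / x    # Product implication

quarter = prod(HALF, HALF)                  # 1/2 * 1/2 = 1/4
three_quarters = neg_l(quarter)             # 1 - 1/4 = 3/4
third = imp_p(three_quarters, quarter)      # (1/4) / (3/4) = 1/3
two_thirds = neg_l(third)                   # 1 - 1/3 = 2/3
```

Each step is an instance of a book-keeping axiom: for example, the value of `third` is the instance of (RŁΠ4) with $r = 3/4$ and $s = 1/4$.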

3 A Logic of Conditional Probability 

In this section we define a fuzzy modal logic, built up over the many-valued logic
ŁΠ½, that we shall call FCP(ŁΠ) (FCP for Fuzzy Conditional Probability),
to reason about coherent conditional probability of crisp propositions.

The language of FCP(ŁΠ) is defined in two steps:

Non-modal formulas: they are built from a set $V$ of propositional variables
$\{p_1, p_2, \ldots, p_n, \ldots\}$ using the classical connectives $\wedge$ and $\neg$. Other
connectives like $\vee$, $\to$ and $\leftrightarrow$ are defined from $\wedge$ and $\neg$ in the usual way. Non-modal
formulas (we will also refer to them as Boolean propositions) will be denoted by
lower case Greek letters $\varphi$, $\psi$, etc. The set of non-modal formulas will be denoted
by $\mathcal{L}$.

Modal formulas: they are built from elementary modal formulas of the form
$P(\varphi \mid \chi)$, where $\varphi$ and $\chi$ are non-modal formulas, using the connectives of ŁΠ
($\to_L$, $\odot$, $\to_\Pi$) and the truth constants $\bar{r}$, for each rational $r \in [0, 1]$. We shall
denote them by upper case Greek letters $\Phi$, $\Psi$, etc. Notice that we do not allow
nested modalities.

Definition 2. The axioms of the logic FCP(ŁΠ) are the following:

(i) Axioms of Classical propositional Logic for non-modal formulas

(ii) Axioms of ŁΠ½ for modal formulas

4 We say that an evaluation $e$ is a model of a theory $T$ whenever $e(\psi) = 1$ for each
$\psi \in T$.







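(iii) Probabilistic modal axioms:

(FCP1) $P(\varphi \to \psi \mid \chi) \to_L (P(\varphi \mid \chi) \to_L P(\psi \mid \chi))$

(FCP2) $P(\neg\varphi \mid \chi) \equiv \neg_L P(\varphi \mid \chi)$

(FCP3) $P(\varphi \vee \psi \mid \chi) \equiv ((P(\varphi \mid \chi) \to_L P(\varphi \wedge \psi \mid \chi)) \to_L P(\psi \mid \chi))$

(FCP4) $P(\varphi \wedge \psi \mid \chi) \equiv P(\psi \mid \varphi \wedge \chi) \odot P(\varphi \mid \chi)$

(FCP5) $P(\chi \mid \chi)$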

Deduction rules of FCP(ŁΠ) are those of ŁΠ (i.e. modus ponens and necessitation
for $\Delta$), plus:

(iv) necessitation for $P$: from $\varphi$ derive $P(\varphi \mid \chi)$

(v) substitution of equivalents for the conditioning event: from $\chi \leftrightarrow \chi'$ derive
$P(\varphi \mid \chi) \equiv P(\varphi \mid \chi')$

The notion of proof is defined as usual. We will denote that in FCP(ŁΠ) a
formula $\Phi$ follows from a theory (set of formulas) $T$ by $T \vdash_{FCP} \Phi$. The only
remark is that the rule of necessitation for $P(\cdot \mid \chi)$ can only be applied to
Boolean theorems.

The semantics for FCP(ŁΠ) is given by conditional probability Kripke structures
$K = (W, \mathcal{U}, e, \mu)$, where:

- $W$ is a non-empty set of possible worlds.

- $e : V \times W \to \{0, 1\}$ provides for each world a Boolean (two-valued) evaluation
of the propositional variables, that is, $e(p, w) \in \{0, 1\}$ for each propositional
variable $p \in V$ and each world $w \in W$. A truth-evaluation $e(\cdot, w)$ is extended
to Boolean propositions as usual. For a Boolean formula $\varphi$, we will write
$[\varphi]_W = \{w \in W \mid e(\varphi, w) = 1\}$.

- $\mu : \mathcal{U} \times \mathcal{U}^0 \to [0, 1]$ is a conditional probability over a Boolean algebra
$\mathcal{U}$ of subsets of $W$$^{5}$, where $\mathcal{U}^0 = \mathcal{U} \setminus \{\emptyset\}$, and such that $([\varphi]_W, [\chi]_W)$ is
$\mu$-measurable for any non-modal $\varphi$ and $\chi$ (with $[\chi]_W \neq \emptyset$).

- $e(\cdot, w)$ is extended to elementary modal formulas by defining

$e(P(\varphi \mid \chi), w) = \mu([\varphi]_W \mid [\chi]_W)$$^{6}$,

and to arbitrary modal formulas according to ŁΠ½ semantics, that is:



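\[
\begin{aligned}
e(\bar{r}, w) &= r, \\
e(\Phi \to_L \Psi, w) &= \min(1 - e(\Phi, w) + e(\Psi, w),\ 1), \\
e(\Phi \odot \Psi, w) &= e(\Phi, w) \cdot e(\Psi, w), \\
e(\Phi \to_\Pi \Psi, w) &=
  \begin{cases}
    1, & \text{if } e(\Phi, w) \le e(\Psi, w) \\
    e(\Psi, w)/e(\Phi, w), & \text{otherwise.}
  \end{cases}
\end{aligned}
\]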



Notice that if $\Phi$ is a modal formula the truth-evaluations $e(\Phi, w)$ depend only
on the conditional probability measure $\mu$ and not on the particular world $w$.

5 Notice that in our definition the factors of the Cartesian product are the same
Boolean algebra. This is clearly a special case of what is stated in Definition 1.

6 When $[\chi]_W = \emptyset$, we define $e(P(\varphi \mid \chi), w) = 1$.
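The semantics above can be made concrete with a small structure. The sketch below is our own illustration and makes an assumption that the paper does not: the conditional probability is generated by a finite list of weight layers, where $\mu(A \mid H)$ is computed in the first layer that gives $H$ positive mass. This is one simple way to obtain a conditional probability that stays defined when the conditioning event has unconditional probability 0. All names and numbers are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

VARS = ("p", "q")
# Worlds are Boolean evaluations, in the order (p,q) = 00, 01, 10, 11.
WORLDS = [dict(zip(VARS, bits)) for bits in product((0, 1), repeat=len(VARS))]

# Hypothetical layered weights: layer 0 puts all mass on worlds where p holds,
# layer 1 covers the remaining worlds, so every non-empty event has positive
# mass in some layer.
LAYERS = [
    [F(0), F(0), F(1, 4), F(3, 4)],
    [F(1, 2), F(1, 2), F(0), F(0)],
]

def extension(phi):
    """[phi]_W as a set of world indices; phi is a predicate on worlds."""
    return {i for i, w in enumerate(WORLDS) if phi(w)}

def mu(a, h):
    """mu(A | H) for index sets A and H, with H non-empty."""
    for weights in LAYERS:
        mass_h = sum(weights[i] for i in h)
        if mass_h > 0:
            return sum(weights[i] for i in a & h) / mass_h
    raise ValueError("conditioning event has no mass in any layer")

def eval_P(phi, chi):
    """Truth value of P(phi | chi); an empty condition yields 1 (footnote 6)."""
    h = extension(chi)
    if not h:
        return F(1)
    return mu(extension(phi), h)
```

With these weights the event "not p" has unconditional probability 0, yet `eval_P` still assigns a value to formulas conditioned on it, which is the situation that a primitive conditional probability is meant to handle.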







The truth-degree of a formula $\Phi$ in a conditional probability Kripke structure
$K = (W, \mathcal{U}, e, \mu)$, written $\|\Phi\|_K$, is defined as

\[ \|\Phi\|_K = \inf_{w \in W} e(\Phi, w). \]

When $\|\Phi\|_K = 1$ we will say that $\Phi$ is valid in $K$ or that $K$ is a model for
$\Phi$, and it will be also written $K \models \Phi$. Let $T$ be a set of formulas. Then we
say that $K$ is a model of $T$ if $K \models \Phi$ for all $\Phi \in T$. Now let $\mathcal{M}$ be a class
of conditional probability Kripke structures. Then we define the truth-degree
$\|\Phi\|_T^{\mathcal{M}}$ of a formula in a theory $T$ relative to the class $\mathcal{M}$ as

\[ \|\Phi\|_T^{\mathcal{M}} = \inf\{\|\Phi\|_K \mid K \in \mathcal{M},\ K \text{ being a model of } T\}. \]

The notion of logical entailment relative to the class $\mathcal{M}$, written $\models_{\mathcal{M}}$, is then
defined as follows:

\[ T \models_{\mathcal{M}} \Phi \quad \text{iff} \quad \|\Phi\|_T^{\mathcal{M}} = 1. \]

That is, $\Phi$ logically follows from a set of formulas $T$ if every structure of $\mathcal{M}$ which
is a model of $T$ also is a model of $\Phi$. If $\mathcal{M}$ denotes the whole class of conditional
probability Kripke structures we shall write $T \models_{FCP} \Phi$ and $\|\Phi\|_T^{FCP}$.

It is easy to check that axioms FCP1-FCP5 are valid formulas in the class
of all conditional probability Kripke structures. Moreover, the inference rule of
substitution of equivalents preserves truth in a model, while the necessitation rule
for $P$ preserves validity in a model. Therefore we have the following soundness
result.

Lemma 1. (Soundness) The logic FCP(ŁΠ) is sound with respect to the class
of conditional probability Kripke structures.
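The validity of FCP1-FCP5 can be spot-checked on a concrete structure. The following sketch is our own and rests on stated assumptions: Boolean formulas are identified with their extensions (subsets of a four-world set), and the conditional probability is the hypothetical layered one, where $\mu(A \mid H)$ is taken in the first weight layer giving $H$ positive mass. A check on one structure illustrates soundness; it does not prove it.

```python
from fractions import Fraction as F
from itertools import combinations

N_WORLDS = 4
W = frozenset(range(N_WORLDS))
# Hypothetical weight layers; every world has positive weight in some layer.
LAYERS = [[F(0), F(0), F(1, 4), F(3, 4)],
          [F(1, 2), F(1, 2), F(0), F(0)]]

def events():
    """All subsets of W, i.e. the extensions of all Boolean formulas."""
    return [frozenset(c) for k in range(N_WORLDS + 1)
            for c in combinations(range(N_WORLDS), k)]

def mu(a, h):
    """mu(A | H); returns 1 when H is empty, following footnote 6."""
    for weights in LAYERS:
        mass = sum(weights[i] for i in h)
        if mass > 0:
            return sum(weights[i] for i in a & h) / mass
    return F(1)

def imp_l(x, y): return min(1 - x + y, 1)
def equiv(x, y): return 1 - abs(x - y)

def fcp_axioms_hold():
    """Evaluate FCP1-FCP5 for all events with a non-empty condition."""
    evs = events()
    for chi in evs:
        if not chi:
            continue                                   # skip empty conditions
        if mu(chi, chi) != 1:                          # FCP5
            return False
        for phi in evs:
            if equiv(mu(W - phi, chi), 1 - mu(phi, chi)) != 1:   # FCP2
                return False
            for psi in evs:
                impl = (W - phi) | psi                 # extension of phi -> psi
                fcp1 = imp_l(mu(impl, chi),
                             imp_l(mu(phi, chi), mu(psi, chi)))
                fcp3 = equiv(mu(phi | psi, chi),
                             imp_l(imp_l(mu(phi, chi), mu(phi & psi, chi)),
                                   mu(psi, chi)))
                fcp4 = equiv(mu(phi & psi, chi),
                             mu(psi, phi & chi) * mu(phi, chi))
                if not (fcp1 == fcp3 == fcp4 == 1):
                    return False
    return True
```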

For any $\varphi, \psi \in \mathcal{L}$, define $\varphi \sim \psi$ iff $\vdash \varphi \leftrightarrow \psi$ in classical logic. The relation
$\sim$ is an equivalence relation in the crisp language $\mathcal{L}$ and $[\varphi]$ will denote the
equivalence class of $\varphi$, containing the propositions provably equivalent to $\varphi$.
Obviously, the quotient set $\mathcal{L}/\!\sim$ of classes of provably equivalent non-modal
formulas in FCP(ŁΠ) forms a Boolean algebra which is isomorphic to a
corresponding Boolean subalgebra $B(\Omega)$$^{7}$ of the power set of the set $\Omega$ of Boolean
interpretations of the crisp language $\mathcal{L}$. For each $\varphi \in \mathcal{L}$, we shall identify the
equivalence class $[\varphi]$ with the set $\{\omega \in \Omega \mid \omega(\varphi) = 1\} \in B(\Omega)$ of interpretations
that make $\varphi$ true. We shall denote by $CP(\mathcal{L})$ the set of conditional probabilities
over $\mathcal{L}/\!\sim_{FCP} \times\, (\mathcal{L}/\!\sim_{FCP} \setminus [\bot])$ or equivalently on $B(\Omega) \times B(\Omega)^0$.

Notice that each conditional probability $\mu \in CP(\mathcal{L})$ induces a conditional
probability Kripke structure $(\Omega, B(\Omega), e_\mu, \mu)$ where $e_\mu(p, \omega) = \omega(p) \in \{0, 1\}$ for
each $\omega \in \Omega$ and each propositional variable $p$. We shall denote by CPS the

7 Actually, $B(\Omega) = \{\{\omega \in \Omega \mid \omega(\varphi) = 1\} \mid \varphi \in \mathcal{L}\}$. Needless to say, if the language has
only finitely many propositional variables then the algebra $B(\Omega)$ is just the whole
power set of $\Omega$, otherwise it is a strict subalgebra.







class of Kripke structures induced by conditional probabilities $\mu \in CP(\mathcal{L})$, i.e.
$CPS = \{(\Omega, B(\Omega), e_\mu, \mu) \mid \mu \in CP(\mathcal{L})\}$. Abusing the language, we will say that
a conditional probability $\mu \in CP(\mathcal{L})$ is a model of a modal theory $T$ whenever
the induced Kripke structure $K_\mu = (\Omega, B(\Omega), e_\mu, \mu)$ is a model of $T$. Besides, we
shall often write $\mu(\varphi \mid \chi)$ actually meaning $\mu([\varphi] \mid [\chi])$.

Actually, for our purposes, we can restrict ourselves to the class of conditional
probability Kripke structures CPS. In fact, it is not difficult to prove the
following lemma.

Lemma 2. For each conditional probability Kripke structure $K = (W, \mathcal{U}, e, \mu)$
there is a conditional probability $\mu^* : B(\Omega) \times B(\Omega)^0 \to [0, 1]$ such that
$\|P(\varphi \mid \chi)\|_K = \mu^*(\varphi \mid \chi)$ for all $\varphi, \chi \in \mathcal{L}$ such that $[\chi] \neq \emptyset$. Therefore, it also
holds that $\|\Phi\|_T^{FCP} = \|\Phi\|_T^{CPS}$ for any modal formula $\Phi$ and any modal theory $T$.

As a consequence we have the following simple corollary. 

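Corollary 1. For any modal theory $T$ over FCP(ŁΠ) and non-modal formulas
$\varphi$ and $\chi$ (with $[\chi] \neq \emptyset$) the following conditions hold:

(i) $T \models_{FCP} \bar{r} \to_L P(\varphi \mid \chi)$ iff $\mu(\varphi \mid \chi) \ge r$ for each $\mu \in CP(\mathcal{L})$ model of $T$.

(ii) $T \models_{FCP} P(\varphi \mid \chi) \to_L \bar{r}$ iff $\mu(\varphi \mid \chi) \le r$ for each $\mu \in CP(\mathcal{L})$ model of $T$.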

Now, we show that FCP(ŁΠ) is strongly complete for finite modal theories
with respect to the intended probabilistic semantics.

Theorem 2. (Strong finite probabilistic completeness of FCP(ŁΠ)) Let
$T$ be a finite modal theory over FCP(ŁΠ) and $\Phi$ a modal formula. Then $T \vdash_{FCP} \Phi$
iff $e_\mu(\Phi) = 1$ for each conditional probability model $\mu$ of $T$.

Proof. The proof is an adaptation of the proof in [11], which in turn is based
on [13,12] where the underlying logics considered were Łukasiewicz logic and
Rational Pavelka logic rather than ŁΠ½.

By soundness we have that $T \vdash_{FCP(\text{ŁΠ})} \Phi$ implies $T \models_{FCP(\text{ŁΠ})} \Phi$. We have
to prove the converse. In order to do so, the basic idea consists in transforming
modal theories over FCP(ŁΠ) into theories over ŁΠ½.
Define a theory, called $\mathcal{F}$, as follows:

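1. take as propositional variables of the theory variables of the form $f_{\varphi|\chi}$, where
$\varphi$ and $\chi$ are classical propositions from $\mathcal{L}$.

2. take as axioms of the theory the following ones, for each $\varphi$, $\psi$ and $\chi$:

(F1) $f_{\varphi|\chi}$, for $\varphi$ being a classical tautology,

(F2) $f_{\varphi|\chi} \equiv f_{\varphi|\chi'}$, for any $\chi, \chi'$ such that $\chi \leftrightarrow \chi'$ is a tautology,

(F3) $f_{\varphi \to \psi|\chi} \to_L (f_{\varphi|\chi} \to_L f_{\psi|\chi})$,

(F4) $f_{\neg\varphi|\chi} \equiv \neg_L f_{\varphi|\chi}$,

(F5) $f_{\varphi \vee \psi|\chi} \equiv [(f_{\varphi|\chi} \to_L f_{\varphi \wedge \psi|\chi}) \to_L f_{\psi|\chi}]$,

(F6) $f_{\varphi \wedge \psi|\chi} \equiv f_{\psi|\varphi \wedge \chi} \odot f_{\varphi|\chi}$,

(F7) $f_{\chi|\chi}$.

Then define a mapping * from modal formulas to ŁΠ½-formulas as follows: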







1. $(P(\varphi \mid \chi))^* = f_{\varphi|\chi}$

2. $\bar{r}^* = \bar{r}$

3. $(\Phi \circ \Psi)^* = \Phi^* \circ \Psi^*$, for $\circ \in \{\to_L, \odot, \to_\Pi\}$
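The translation is a straightforward structural recursion, which the following sketch spells out. The tuple encoding of formulas, the atom naming scheme `f[phi|chi]`, and the helper `evaluate` are our own illustrative choices and do not come from the paper.

```python
from fractions import Fraction as F

# Hypothetical encoding of modal formulas as nested tuples:
#   ("P", phi, chi)        elementary modal formula P(phi | chi)
#   ("const", r)           truth constant for the rational r
#   (op, left, right)      with op one of the three binary connectives
BINARY = {"->L", "(.)", "->P"}

def translate(formula):
    """Map a modal formula to its *-translation over atoms f_{phi|chi}."""
    tag = formula[0]
    if tag == "P":
        _, phi, chi = formula
        return ("atom", f"f[{phi}|{chi}]")
    if tag == "const":
        return formula
    if tag in BINARY:
        _, left, right = formula
        return (tag, translate(left), translate(right))
    raise ValueError(f"unknown constructor: {tag!r}")

def evaluate(formula, e):
    """Evaluate a translated formula under an assignment e of atoms."""
    tag = formula[0]
    if tag == "atom":
        return e[formula[1]]
    if tag == "const":
        return formula[1]
    if tag not in BINARY:
        raise ValueError(f"unknown constructor: {tag!r}")
    x, y = evaluate(formula[1], e), evaluate(formula[2], e)
    if tag == "->L":
        return min(1 - x + y, F(1))
    if tag == "(.)":
        return x * y
    return F(1) if x <= y else y / x      # remaining case: "->P"
```

For instance, the statement "the probability of a given b is at least 4/5" becomes a formula over the single atom `f[a|b]`, and its truth value is then fixed by the value the evaluation assigns to that atom.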

Let us denote by $T^*$ the set of all formulas translated from $T$. First, by the
construction of $\mathcal{F}$, one can easily check that for any $\Phi$,

\[ T \vdash_{FCP(\text{ŁΠ})} \Phi \quad \text{iff} \quad T^* \cup \mathcal{F} \vdash_{\text{ŁΠ½}} \Phi^*. \tag{1} \]

Notice that the use in a proof from $T^* \cup \mathcal{F}$ of instances of (F1) and (F2)
corresponds to the use of the inference rules of necessitation for $P$ and substitution of
equivalents in FCP(ŁΠ), while instances of (F3)-(F7) obviously correspond
to axioms (FCP1)-(FCP5) respectively.

Now, we prove that the semantical analogue of (1) also holds, that is,

\[ T \models_{FCP(\text{ŁΠ})} \Phi \quad \text{iff} \quad T^* \cup \mathcal{F} \models_{\text{ŁΠ½}} \Phi^*. \tag{2} \]

First, we show that each ŁΠ½-evaluation $e$ which is a model of $T^* \cup \mathcal{F}$ determines
a conditional probabilistic Kripke model $K_e$ of $T$ such that $e(\Phi^*) = \|\Phi\|_{K_e}$ for
any modal formula $\Phi$. Actually, we can define the conditional probability $\mu_e$ on
$B(\Omega) \times B(\Omega)^0$ as follows:

\[ \mu_e([\varphi] \mid [\chi]) = e(f_{\varphi|\chi}). \]

So defined, $\mu_e$ is indeed a conditional probability; this is clear since by
hypothesis $e$ is a model of $\mathcal{F}$. Then, it is also clear that in the model
$K_e = (\Omega, B(\Omega), e_{\mu_e}, \mu_e)$ the truth-degree of modal formulas $\Phi$ coincides with the
truth-evaluations $e(\Phi^*)$, since they only depend on the values of $\mu_e$ and $e$ over the
elementary modal formulas $P(\varphi \mid \chi)$ and atoms $f_{\varphi|\chi}$ respectively.

Conversely, we have now to prove that each conditional probability Kripke
structure $K = (W, \mathcal{U}, e, \mu)$ determines an ŁΠ½-evaluation $e_K$ which is a model of $\mathcal{F}$
and such that $e_K(\Phi^*) = \|\Phi\|_K$ for any modal formula $\Phi$. We only need to set

\[
e_K(f_{\varphi|\chi}) =
  \begin{cases}
    \mu([\varphi]_W \mid [\chi]_W), & \text{if } [\chi]_W \neq \emptyset \\
    1, & \text{if } [\chi]_W = \emptyset.
  \end{cases}
\]

It is easy to see then that $e_K$ is a model of axioms F1-F7, and moreover that
for any modal formula $\Phi$, we have $e_K(\Phi^*) = \|\Phi\|_K$. Hence we have proved the
equivalence (2).

From (1) and (2), to prove the theorem it remains to show that

\[ T^* \cup \mathcal{F} \vdash_{\text{ŁΠ½}} \Phi^* \quad \text{iff} \quad T^* \cup \mathcal{F} \models_{\text{ŁΠ½}} \Phi^*. \]

Note that ŁΠ½ is strongly complete but only for finite theories. The initial
modal theory $T$ is finite, and so is $T^*$. However, $\mathcal{F}$ contains infinitely many
instances of axioms F1-F7. Nonetheless one can prove that such infinitely many
instances can be replaced by only finitely many instances, by using propositional
normal forms, again following the lines of [12, 8.4.12].

Take $n$ propositional variables $p_1, \ldots, p_n$ containing at least all variables in $T$.
For any formula $\varphi$ built from these propositional variables, take the
corresponding disjunctive normal form $(\varphi)_{dnf}$. Notice that there are $2^{2^n}$ different normal







forms. Then, when translating a modal formula $\Phi$ into $\Phi^*$, we replace each atom
$f_{\varphi|\chi}$ by $f_{(\varphi)_{dnf}|(\chi)_{dnf}}$ to obtain its normal translation $\Phi^*_{dnf}$. The theory $T^*_{dnf}$ is
the (finite) set of all $\Phi^*_{dnf}$, where $\Phi \in T$. The theory $\mathcal{F}_{dnf}$ is the finite set of
instances of axioms F1-F7 for disjunctive normal forms of Boolean formulas
built from the propositional variables $p_1, \ldots, p_n$. We can now prove the following
lemma.

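Lemma 3. (i) $T^* \cup \mathcal{F} \vdash_{\text{ŁΠ½}} \Phi^*$ iff $T^*_{dnf} \cup \mathcal{F}_{dnf} \vdash_{\text{ŁΠ½}} \Phi^*_{dnf}$.

(ii) $T^* \cup \mathcal{F} \models_{\text{ŁΠ½}} \Phi^*$ iff $T^*_{dnf} \cup \mathcal{F}_{dnf} \models_{\text{ŁΠ½}} \Phi^*_{dnf}$.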

The proof is similar to [12, 8.4.13]. Finally, we obtain the following chain
of equivalences:

$T \vdash_{FCP} \Phi$ iff $T^* \cup \mathcal{F} \vdash_{\text{ŁΠ½}} \Phi^*$ by (1) above

iff $T^*_{dnf} \cup \mathcal{F}_{dnf} \vdash_{\text{ŁΠ½}} \Phi^*_{dnf}$ by (i) of Lemma 3

iff $T^*_{dnf} \cup \mathcal{F}_{dnf} \models_{\text{ŁΠ½}} \Phi^*_{dnf}$ by finite strong completeness of ŁΠ½

iff $T^* \cup \mathcal{F} \models_{\text{ŁΠ½}} \Phi^*$ by (ii) of Lemma 3

iff $T \models_{FCP} \Phi$ by (2) above

This completes the proof of the theorem.
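The normal-form step in the proof above works because, over $n$ fixed variables, classically equivalent Boolean formulas share one normal form, so only finitely many atoms $f_{\varphi|\chi}$ are needed. The sketch below illustrates this with an assumption of our own: instead of building a syntactic disjunctive normal form, it uses the set of satisfying assignments as a canonical key, which determines the full DNF uniquely.

```python
from itertools import product

def assignments(n):
    """All 2**n Boolean assignments to n variables."""
    return list(product((False, True), repeat=n))

def dnf_key(formula, n):
    """Canonical stand-in for (phi)_dnf: the set of satisfying assignments.

    `formula` is a Python predicate taking n Boolean arguments.
    """
    return frozenset(v for v in assignments(n) if formula(*v))

def count_normal_forms(n):
    """Each subset of the 2**n assignments corresponds to one normal form."""
    return 2 ** (2 ** n)

def atom_name(phi, chi, n):
    """Name of the atom f_{(phi)dnf | (chi)dnf}, identified by canonical keys."""
    return (dnf_key(phi, n), dnf_key(chi, n))
```

Two syntactically different but equivalent formulas, such as $p \vee q$ and $\neg(\neg p \wedge \neg q)$, receive the same key and therefore the same atom.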

The following direct corollary exemplifies some kinds of deductions that are 
usually of interest. 

Corollary 2. Let $T$ be a finite modal theory over FCP(ŁΠ) and let $\varphi$ and $\chi$ be
non-modal formulas, with $[\chi] \neq \emptyset$. Then:

(i) $T \vdash_{FCP} \bar{r} \to_L P(\varphi \mid \chi)$ iff $\mu(\varphi \mid \chi) \ge r$, for each conditional probability
model $\mu$ of $T$.

(ii) $T \vdash_{FCP} P(\varphi \mid \chi) \to_L \bar{r}$ iff $\mu(\varphi \mid \chi) \le r$, for each conditional probability
model $\mu$ of $T$.

It is worth pointing out that the logic FCP(ŁΠ) is actually very powerful
from a knowledge representation point of view. Indeed, it allows one to express
several kinds of statements about conditional probability, such as purely
comparative statements like "the conditional event $\varphi \mid \chi$ is at least as probable as the
conditional event $\psi \mid \delta$" as

\[ P(\psi \mid \delta) \to_L P(\varphi \mid \chi), \]

or numerical probability statements like

- "the probability of $\varphi \mid \chi$ is 0.8" as $P(\varphi \mid \chi) \equiv \overline{0.8}$,

- "the probability of $\varphi \mid \chi$ is at least 0.8" as $\overline{0.8} \to_L P(\varphi \mid \chi)$,

- "the probability of $\varphi \mid \chi$ is at most 0.8" as $P(\varphi \mid \chi) \to_L \overline{0.8}$,

- "$\varphi \mid \chi$ has positive probability" as $\neg_\Pi \neg_\Pi P(\varphi \mid \chi)$,

or even statements about independence, like "$\varphi$ and $\psi$ are independent given $\chi$"
as

\[ P(\varphi \mid \chi \wedge \psi) \equiv P(\varphi \mid \chi). \]
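The statements above are fuzzy formulas, so each has a truth degree determined by the value $m = \mu(\varphi \mid \chi)$, and the statement holds fully exactly when that degree is 1. A minimal sketch of these degrees follows; the function names are ours.

```python
from fractions import Fraction as F

def imp_l(x, y): return min(1 - x + y, F(1))       # Lukasiewicz implication
def equiv(x, y): return 1 - abs(x - y)             # Lukasiewicz equivalence
def neg_p(x):    return F(1) if x == 0 else F(0)   # Product negation

def at_least(r, m):
    """Degree of  r ->_L P(phi|chi)  when P(phi|chi) has value m."""
    return imp_l(r, m)

def at_most(r, m):
    """Degree of  P(phi|chi) ->_L r."""
    return imp_l(m, r)

def exactly(r, m):
    """Degree of  P(phi|chi) == r."""
    return equiv(m, r)

def positive(m):
    """Degree of  neg_Pi neg_Pi P(phi|chi); always 0 or 1."""
    return neg_p(neg_p(m))
```

A degree below 1 measures how far the assessment is from satisfying the statement: with $m = 7/10$, "at least 0.8" has degree $9/10$, whereas "has positive probability" is crisp and takes only the values 0 and 1.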







4 Applications to the Coherence Problem 

Another well-known solution to overcome the difficulties concerning conditional
probability when dealing with zero probabilities consists in using non-standard
probabilities. In this approach only the impossible event can take on probability
0, but non-impossible events can have an infinitesimal probability. Then
the non-standard conditional probability $Pr^*(\varphi \mid \chi)$ may be expressed as
$Pr^*(\varphi \wedge \chi)/Pr^*(\chi)$, which can then be taken as the truth-value of the formula

\[ P(\chi) \to_\Pi P(\varphi \wedge \chi), \]

where $P$ is a (unary) modal operator standing for (unconditional) non-standard
probability. This is the previously mentioned approach$^{8}$ followed by Flaminio
and Montagna in [8], where the authors develop the logic FP(SŁΠ) in
which conditional probability can be treated along with both standard and
non-standard probability. Standard probability $Pr$ is recovered by taking
the standard part of $Pr^*$. This is modelled in the logic by means of a unary
connective $S$, so that the truth-value of $S(P\varphi)$ is the standard probability
of $\varphi$. Furthermore, they show that the notion of coherence of a probabilistic
assessment to a set of conditional events is tantamount to the consistency of a
suitably defined theory over FP(SŁΠ).

Definition 3 ([4]). A probabilistic assessment $\{Pr(\varphi_i \mid \chi_i) = \alpha_i\}_{i=1,\ldots,n}$ over a
set of conditional events $\varphi_i \mid \chi_i$ (with $\chi_i$ not being a contradiction) is coherent
if there is a conditional probability $\mu$, in the sense of Definition 1, such that
$Pr(\varphi_i \mid \chi_i) = \mu(\varphi_i \mid \chi_i)$ for all $i = 1, \ldots, n$.

Note that the above notion of coherence also appears in the literature in a
different form, for instance in [2], in terms of a betting scheme.

Theorem 3 ([8]). Let $\kappa = \{Pr(\varphi_i \mid \chi_i) = \alpha_i : i = 1, \ldots, n\}$ be a rational
probabilistic assignment. Let $B$ be the Boolean algebra generated by $\{\varphi_i, \chi_i \mid i = 1, \ldots, n\}$
and let $\top$ and $\bot$ be its top element and its bottom element respectively. Then $\kappa$
is coherent iff the theory $T^\kappa$ consisting of the axioms of the form $\neg_\Pi \neg_\Pi P(\psi)$
for $\psi \in B \setminus \{\bot\}$, plus the axioms $S(P(\chi_i) \to_\Pi P(\varphi_i \wedge \chi_i)) \equiv \bar{\alpha}_i$ ($i = 1, \ldots, n$) is
consistent in FP(SŁΠ), i.e. $T^\kappa \nvdash_{FP(S\text{ŁΠ})} \bar{0}$.

The proof of this theorem is based on two characterizations of coherence,
given in [4] and [17], using non-standard probabilities, and it is quite
complicated. However, in FCP(ŁΠ), contrary to FP(SŁΠ), conditional probability is
a primitive notion, so an analogous theorem can be proved in the logic FCP(ŁΠ)
in a simpler way.

Theorem 4. Let $\kappa = \{Pr(\varphi_i \mid \chi_i) = \alpha_i : i = 1, \ldots, n\}$ be a rational probabilistic
assessment. Then $\kappa$ is coherent iff the theory $T_\kappa = \{P(\varphi_i \mid \chi_i) \equiv \bar{\alpha}_i : i =
1, \ldots, n\}$ is consistent in FCP(ŁΠ), i.e. $T_\kappa \nvdash_{FCP(\text{ŁΠ})} \bar{0}$.

8 A related approach due to Raskovic et al. [21] deals with conditional probability 
by defining graded (two-valued) operators over the unit interval of a recursive non- 
archimedean field containing all rationals. 







Proof. Remember that we are allowed to restrict ourselves to the subclass CPS
of conditional probability structures. Now, suppose that $T_\kappa$ is consistent. By
strong completeness, there exists a model $(\Omega, B(\Omega), e_\mu, \mu)$ of $T_\kappa$, hence satisfying
$\mu(\varphi_i \mid \chi_i) = \alpha_i$: therefore $\kappa$ is coherent. Conversely, suppose $\kappa$ is a coherent
assessment. Then, there is a conditional probability $\mu$ which extends $\kappa$. Then
the induced Kripke structure $(\Omega, B(\Omega), e_\mu, \mu)$ is a model of $T_\kappa$.
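As an illustration of how Theorem 4 can be used, consider a small assessment of our own devising (it does not appear in the paper) over two propositional variables $a$ and $b$. The derivation below sketches why the corresponding theory is inconsistent, using only axiom (FCP4) and the rule of substitution of equivalents.

```latex
% Hypothetical assessment:
%   \kappa = \{ Pr(a \mid \top) = 1/2,\; Pr(b \mid a) = 2/5,\; Pr(a \wedge b \mid \top) = 3/10 \}
\begin{align*}
  &P(a \wedge b \mid \top) \equiv P(b \mid a \wedge \top) \odot P(a \mid \top)
     && \text{instance of (FCP4)} \\
  &P(b \mid a \wedge \top) \equiv P(b \mid a)
     && \text{substitution of equivalents, since } a \wedge \top \leftrightarrow a \\
  &\mu(a \wedge b \mid \top) = \mu(b \mid a) \cdot \mu(a \mid \top)
     = \tfrac{2}{5} \cdot \tfrac{1}{2} = \tfrac{1}{5}
     && \text{in every model } \mu \text{ of the first two assessments} \\
  &\tfrac{1}{5} \neq \tfrac{3}{10}
     && \text{so no model satisfies } P(a \wedge b \mid \top) \equiv \overline{3/10}
\end{align*}
```

Hence $T_\kappa$ has no model and, by Theorem 4, $\kappa$ is not coherent. Replacing the third value by $1/5$ removes the conflict: any probability with $\mu(a) = 1/2$ and $\mu(a \wedge b) = 1/5$ that is positive on all four atoms yields a conditional probability extending the modified assessment.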

5 Conclusions 

In this paper, we have been concerned with defining the modal logic FCP(ŁΠ)
to reason about coherent conditional probability, exploiting a previous fuzzy
logic approach which deals with unconditional probabilities [11]. Conditional
probability has been taken as a primitive notion, in order to overcome
difficulties related to conditioning events with zero probabilities. FCP(ŁΠ) has
been shown to be strongly complete with respect to the class of conditional
probability Kripke structures when dealing with finite theories. Furthermore,
we have proved that testing consistency of a suitably defined modal theory
over FCP(ŁΠ) is tantamount to testing the coherence of an assessment to an
arbitrary set of conditional events, as defined in [4].

To conclude, we would like to point out some possible directions of our
future work. First, it will be interesting to study whether we could use a logic
weaker than ŁΠ½, since in fact the probabilistic modal axioms do not need
to deal explicitly with the Product implication connective $\to_\Pi$. Thus, it
seems it would be enough to use a logic including only the connectives $\to_L$ and
$\odot$. Second, it will be worth studying theories also including non-modal formulas
over the framework defined. Indeed, this would allow us to treat deduction for
Boolean propositions as well as a logical representation of relationships between
events, for instance when two events are incompatible or one follows from
another. Clearly such an extension would enhance the expressive power of
FCP(ŁΠ). Then, from a semantical point of view, we would be very close to
the so-called model-theoretic probabilistic logic in the sense of Biazzo et al.'s
approach [2] and the links established there to probabilistic reasoning under
coherence and default reasoning (see also [20] for another recent probability
logic approach to model defaults). Actually, FCP(ŁΠ) can provide a (syntactical)
deductive system for such a rich framework. Exploring all these connections
will be an extremely interesting matter of research in the immediate future.

Acknowledgments. Marchioni acknowledges support of the grant No. AP2002-1571
of the Ministerio de Educación, Cultura y Deporte of Spain and Godo
acknowledges partial support of the Spanish project LOGFAC, TIC2001-1577-C03-01.

References 

1. Bacchus, F. Representing and Reasoning with Probabilistic Knowledge. MIT- 
Press, Cambridge Massachusetts, 1990. 







2. Biazzo V., Gilio A., Lukasiewicz T., and Sanfilippo G. Probabilistic logic under
coherence, model-theoretic probabilistic logic, and default reasoning. In Proc.
of ECSQARU-2001, 290-302, 2001.

3. Cintula P. The ŁΠ and ŁΠ½ propositional and predicate logics. Fuzzy Sets and
Systems 124, 289-302, 2001.

4. Coletti, G. and Scozzafava R. Probabilistic Logic in a Coherent Setting. 
Kluwer Academic Publisher, Dordrecht, The Netherlands, 2002. 

5. Esteva F., Godo L. and Montagna F. The ŁΠ and ŁΠ½ logics: two complete
fuzzy logics joining Łukasiewicz and Product logic. Archive for Mathematical Logic
40, 39-67, 2001.

6. Fagin R., Halpern J.Y. and Megiddo N. A logic for reasoning about
probabilities. Information and Computation 87 (1/2), 78-128, 1990.

7. Fattarosi-Barnaba M. and Amati G. Modal operators with probabilistic
interpretations I. Studia Logica 48, 383-393, 1989.

8. Flaminio T. and Montagna F. A logical and algebraic treatment of conditional 
probability. To appear in Proc. of IPMU’04, Perugia, Italy, 2004. 

9. Gaifman H. and Snir M. Probabilities over rich languages, testing and
randomness. The Journal of Symbolic Logic 47, No. 3, 495-548, 1982.

10. Gerla, G. Inferences in probability logic. Artificial Intelligence 70, 33-52, 1994. 

11. Godo L., Esteva F. and Hájek P. Reasoning about probability using fuzzy
logic. Neural Network World 10, No. 5, 811-824, 2000.

12. Hájek P. Metamathematics of Fuzzy Logic. Kluwer, 1998.

13. Hájek P., Godo L. and Esteva F. Fuzzy logic and probability. In Proc. of
UAI'95, Morgan Kaufmann, 237-244, 1995.

14. Halpern J. Y. An analysis of first-order logics of probability. In Proceedings 
of the International Joint Conference on Artificial Intelligence (IJCAI’89), 1375- 
1381, 1989. 

15. Halpern J. Y. Reasoning about Uncertainty. The MIT Press, Cambridge
Massachusetts, 2003.

16. Keisler J. Probability quantifiers. In Model-theoretic Logics, J. Barwise and S.
Feferman (eds.), Springer-Verlag, New York, 539-556, 1985.

17. Krauss P. H. Representation of conditional probability measures on Boolean 
algebras. In Acta Mathematica Academiae Scientiarum Hungaricae, Tomus 19 
(3-4), 229-241, 1969. 

18. Nilsson N. J. Probabilistic logic. Artificial Intelligence 28, No. 1, 71-87, 1986.

19. Ognjanovic Z., Raskovic M. Some probability logics with new types of
probability operators. Journal of Logic and Computation, Vol. 9, Issue 2, 181-195, 1999.

20. Raskovic M., Ognjanovic Z. and Markovic Z. A probabilistic approach to
default reasoning. In Proc. of NMR 2004, Whistler (Canada), 335-341, 2004.

21. Raskovic M., Ognjanovic Z. and Markovic Z. A logic with conditional
probabilities. In Proc. of JELIA 2004, in this volume.

22. Scott D. and Krauss P. Assigning probabilities to logical formulas. In Aspects
of Inductive Logic, J. Hintikka and P. Suppes (eds.), North-Holland, Amsterdam,
219-264, 1966.

23. van der Hoek, W. Some considerations on the logic PFD. Journal of Applied
Non-Classical Logics Vol. 7, Issue 3, 287-307, 1997.

24. Wilson N. and Moral S. A logical view of probability. In Proc. of the 11th
European Conference on Artificial Intelligence (ECAI'94), 386-390, 1994.




A Logic with Conditional Probabilities 



Miodrag Raskovic 1, Zoran Ognjanovic 2, and Zoran Markovic 2
1 Uciteljski fakultet

Narodnog fronta 43, 11000 Beograd, Srbija i Crna Gora
miodragr@mi.sanu.ac.yu
2 Matematicki Institut

Kneza Mihaila 35, 11000 Beograd, Srbija i Crna Gora
zorano@mi.sanu.ac.yu, zoranm@mi.sanu.ac.yu



Abstract. The paper presents a logic which enriches propositional calculus
with three classes of probabilistic operators which are applied to
propositional formulas: $P_{\ge s}(\alpha)$, $CP_{=s}(\alpha, \beta)$ and $CP_{\ge s}(\alpha, \beta)$, with the
intended meaning "the probability of $\alpha$ is at least $s$", "the conditional
probability of $\alpha$ given $\beta$ is $s$", and "the conditional probability of $\alpha$ given
$\beta$ is at least $s$", respectively. Possible-world semantics with a probability
measure on sets of worlds is defined and the corresponding strong
completeness theorem is proved for a rather simple set of axioms. This
is achieved at the price of allowing infinitary rules of inference. One of
these rules enables us to syntactically define the range of the probability
function. This range is chosen to be the unit interval of a recursive
nonarchimedean field, making it possible to define another probabilistic
operator $CP_{\approx 1}(\alpha, \beta)$ with the intended meaning "probabilities of $\alpha \wedge \beta$
and $\beta$ are almost the same". This last operator may be used to model
default reasoning.



1 Introduction 

The problem of reasoning with uncertain knowledge is an ancient problem dating,
at least, from Leibniz and Boole. In the last decades an approach was developed,
connected with computer science and artificial intelligence, which starts with
propositional calculus and adds "probability operators" that behave like modal
operators. Consequently, the semantics consists in special types of Kripke models
(possible worlds) with the addition of a probability measure defined over the worlds
[6,7]. The main problem with that approach is providing an axiom system which
would be strongly complete. This results from the inherent non-compactness of
such systems. Namely, in such languages it is possible to define an inconsistent
infinite set of formulas, every finite subset of which is consistent (e.g., $\{\neg P_{=0}\alpha\} \cup
\{P_{<1/n}\alpha : n \text{ is a positive integer}\}$). Building on our previous work [14,15,16,
17], we define a system which we show to be sound and strongly complete,
using infinitary rules of inference (i.e., rules where a conclusion has a countable
set of premises). Thus, all formulas, axioms and theorems are finite, but the
proofs might be countably infinite. Since we already have infinitary rules, we
also introduce another infinitary rule which enables us to syntactically define



J.J. Alferes and J. Leite (Eds.): JELIA 2004, LNAI 3229, pp. 226-238, 2004. 
(c) Springer-Verlag Berlin Heidelberg 2004







the range of the probability function which will appear in the interpretation.
We choose here this range to be the unit interval of a recursive nonarchimedean
field containing all rational numbers (an example of such a field would be the
Hardy field $Q[\epsilon]$, where $\epsilon$ is an infinitesimal). A similar rule was given in [2] but
restricted to rationals only. In this paper we introduce, in addition to the usual
probabilistic operators $P_{\ge s}\alpha$ (with the intended meaning "the probability of $\alpha$ is
at least $s$"), also the conditional probability operators $CP_{=s}(\alpha, \beta)$ and $CP_{\ge s}(\alpha, \beta)$,
with the intended meaning "the conditional probability of $\alpha$ given $\beta$ is $s$" and "at
least $s$", respectively. Since we specify, already in the syntax, that the range
of probability is nonarchimedean, it is possible also to introduce the conditional
probability operator $CP_{\approx 1}(\alpha, \beta)$ with the intended meaning "the probabilities of
$\alpha \wedge \beta$ and $\beta$ are almost the same". It turns out that this formula may be used to
model defaults. In a companion paper [18] it is shown that, if we restrict attention
only to formulas of this type, the resulting system coincides with the system P of
[12] when we work only with finite sets of assumptions. If we allow inference
from an infinite set of "defaults" our system is somewhat stronger. The main
advantage, however, is that we can use the full probability logic and thus express
explicitly properties that cannot be formulated in the language of defaults.
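The non-compactness example from the beginning of this introduction, the set $\{\neg P_{=0}\alpha\} \cup \{P_{<1/n}\alpha : n \text{ a positive integer}\}$, can be illustrated numerically. The sketch below is ours and assumes standard rational-valued probabilities; with the nonarchimedean range described above, an infinitesimal value would behave differently. It shows that every finite subset has a witness value, while each positive standard rational is refuted by some member of the infinite set.

```python
from fractions import Fraction as F

def satisfies(p, ns):
    """Does value p satisfy  not P_{=0} alpha  and  P_{<1/n} alpha  for n in ns?"""
    return p != 0 and all(p < F(1, n) for n in ns)

def witness(ns):
    """A probability for alpha satisfying the finite subset indexed by ns."""
    return F(1, max(ns, default=1) + 1)

def refuting_index(p):
    """For a standard rational p > 0, an n with 1/n <= p, so P_{<1/n} fails."""
    return -(-p.denominator // p.numerator)   # ceil(denominator / numerator)
```

Usage: `witness([1, 2, 5, 40])` gives a value below $1/40$ that is still positive, whereas for any fixed positive rational `p` the formula $P_{<1/n}\alpha$ with `n = refuting_index(p)` is violated.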

Few papers discuss conditional probabilities from the logical point of view.
We are aware of only one paper [7] in which conditional probability
is defined syntactically. However, a complicated machinery of real closed fields
was needed to obtain a corresponding sound and complete axiomatization. In
our approach, since the parts of field theory are moved to the meta theory, the
axioms are rather simple. Also, we are able to prove the extended completeness
theorem ("every consistent set of formulas has a model"), which is impossible
for the system in [7], although at the price of introducing infinitary deduction
rules. One should add that systems with infinitary rules of inference may be
decidable; whether this holds for the present system remains to be determined.
Conditional probability is also analyzed in [4], but only on the semantical level,
along the ideas proposed by de Finetti. In [1,9,8,13] conditional probabilities are
used in the field of nonmonotonic reasoning, but without any axiomatization.

The rest of the paper is organized as follows. In Section 2 the syntax of the
logic is given. Section 3 describes the class $LPP^{S}_{Meas,Neat}$ of
measurable and neat models, while in Section 4 a corresponding sound and
complete axiomatic system is introduced. A proof of the completeness theorem is
presented in Section 5. In Section 6 we describe how our system can be used to
model default reasoning and analyze some properties of the corresponding
default consequence relation. We conclude in Section 7.



2 Syntax 



Let $S$ be the unit interval of a recursive nonarchimedean field containing all
rational numbers. An example of such a field is the Hardy field $Q[\epsilon]$.
$Q[\epsilon]$ contains all rational functions of a fixed infinitesimal
$\epsilon$ which belongs to a nonstandard







elementary extension $R^*$ of the standard real numbers [10,19]. We use
$\epsilon_1$, $\epsilon_2$, ... to denote infinitesimals from $S$.
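To make the role of the nonarchimedean range concrete, the following lines sketch what it means for $\epsilon$ to be infinitesimal. They assume that $\epsilon$ is a positive infinitesimal; the particular elements listed are illustrative choices, not examples taken from the paper.

```latex
% Assumption: \epsilon is a positive infinitesimal.
% It lies below every positive rational:
\[
  0 < \epsilon < \tfrac{1}{n} \quad \text{for every positive integer } n .
\]
% Hence the field is nonarchimedean: no finite multiple of \epsilon reaches 1,
\[
  n \cdot \epsilon < 1 \quad \text{for every positive integer } n .
\]
% Illustrative elements of the unit interval S, in increasing order:
\[
  0 \;<\; \epsilon^{2} \;<\; \epsilon \;<\; \tfrac{1}{2} \;<\; \tfrac{1}{2} + \epsilon \;<\; 1 - \epsilon \;<\; 1 .
\]
```

In particular, $1 - \epsilon$ is strictly smaller than 1 yet larger than every rational number below 1, which is the kind of value needed to express "almost the same" probabilities.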

Let $\{s_0, s_1, \ldots\}$ be an enumeration of $S$. The language of the logic
consists of: a denumerable set $Var = \{p, q, r, \ldots\}$ of propositional
letters, classical connectives $\neg$ and $\wedge$, a list of unary
probabilistic operators $(P_{\geq s})_{s \in S}$, a list of binary probabilistic
operators $(CP_{\geq s})_{s \in S}$, a list of binary probabilistic operators
$(CP_{=s})_{s \in S}$, and a binary probabilistic operator $CP_{\approx 1}$.

The set $For_C$ of classical propositional formulas is the smallest set $X$
containing $Var$ and closed under the formation rules: if $\alpha$ and $\beta$
belong to $X$, then $\neg\alpha$ and $(\alpha \wedge \beta)$ are in $X$.
Elements of $For_C$ will be denoted by $\alpha$, $\beta$, ... The set $For_P$
of probabilistic propositional formulas is the smallest set $Y$ containing all
formulas of the forms: $P_{\geq s}\alpha$ for $\alpha \in For_C$, $s \in S$;
$CP_{=s}(\alpha, \beta)$ for $\alpha, \beta \in For_C$, $s \in S$;
$CP_{\geq s}(\alpha, \beta)$ for $\alpha, \beta \in For_C$, $s \in S$; and
$CP_{\approx 1}(\alpha, \beta)$ for $\alpha, \beta \in For_C$, and closed under
the formation rules: if $A$ and $B$ belong to $Y$, then $\neg A$ and
$(A \wedge B)$ are in $Y$. Formulas from $For_P$ will be denoted by $A$, $B$,
... Note that we use the prefix notation $CP_{\geq s}(\alpha, \beta)$ (and
similarly for $CP_{=s}(\alpha, \beta)$ and $CP_{\approx 1}(\alpha, \beta)$)
rather than the corresponding infix notation $\alpha\, CP_{\geq s}\, \beta$
($\alpha\, CP_{=s}\, \beta$, $\alpha\, CP_{\approx 1}\, \beta$).

As can be seen, neither mixing of pure propositional formulas and probability
formulas, nor nesting of probabilistic operators is allowed. For example,
$\alpha \wedge P_{\geq s}\beta$ and $P_{\geq s}P_{\geq r}\alpha$ are not
well-formed formulas.
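The two-layer grammar described above can be sketched as a small well-formedness checker. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the index s is represented by a rational `Fraction`, whereas the set S of the paper also contains non-rational elements such as infinitesimals.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Any


@dataclass(frozen=True)
class Var:            # propositional letter from Var
    name: str

@dataclass(frozen=True)
class Not:            # negation, shared by both layers
    arg: Any

@dataclass(frozen=True)
class And:            # conjunction, shared by both layers
    left: Any
    right: Any

@dataclass(frozen=True)
class PGeq:           # P_{>=s} alpha
    s: Fraction       # assumption: rational stand-in for an element of S
    alpha: Any

@dataclass(frozen=True)
class CPEq:           # CP_{=s}(alpha, beta)
    s: Fraction
    alpha: Any
    beta: Any

@dataclass(frozen=True)
class CPGeq:          # CP_{>=s}(alpha, beta)
    s: Fraction
    alpha: Any
    beta: Any

@dataclass(frozen=True)
class CPApprox1:      # CP_{~1}(alpha, beta)
    alpha: Any
    beta: Any


def is_classical(f) -> bool:
    """True if f is built from letters using only negation and conjunction."""
    if isinstance(f, Var):
        return True
    if isinstance(f, Not):
        return is_classical(f.arg)
    if isinstance(f, And):
        return is_classical(f.left) and is_classical(f.right)
    return False


def is_probabilistic(f) -> bool:
    """True if f is a Boolean combination of basic probabilistic formulas
    whose arguments are classical (so no mixing and no nesting)."""
    if isinstance(f, PGeq):
        return 0 <= f.s <= 1 and is_classical(f.alpha)
    if isinstance(f, (CPEq, CPGeq)):
        return 0 <= f.s <= 1 and is_classical(f.alpha) and is_classical(f.beta)
    if isinstance(f, CPApprox1):
        return is_classical(f.alpha) and is_classical(f.beta)
    if isinstance(f, Not):
        return is_probabilistic(f.arg)
    if isinstance(f, And):
        return is_probabilistic(f.left) and is_probabilistic(f.right)
    return False


p, q = Var("p"), Var("q")
half = Fraction(1, 2)
mixed = And(p, PGeq(half, q))          # classical and probabilistic mixed
nested = PGeq(half, PGeq(half, p))     # nested probabilistic operators
```

With these definitions, `mixed` and `nested` are rejected by both checkers, while `And(PGeq(half, p), Not(CPApprox1(p, q)))` is accepted as a probabilistic formula.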

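The other classical connectives ($\vee$, $\rightarrow$, $\leftrightarrow$) can
be defined as usual, while we denote $\neg P_{\geq s}\alpha$ by
$P_{<s}\alpha$, $P_{\geq 1-s}\neg\alpha$ by $P_{\leq s}\alpha$,
$\neg P_{\leq s}\alpha$ by $P_{>s}\alpha$,
$P_{\geq s}\alpha \wedge \neg P_{>s}\alpha$ by $P_{=s}\alpha$,
$\neg P_{=s}\alpha$ by $P_{\neq s}\alpha$, $\neg CP_{\geq s}(\alpha, \beta)$ by
$CP_{<s}(\alpha, \beta)$, $CP_{<s}(\alpha, \beta) \vee CP_{=s}(\alpha, \beta)$
by $CP_{\leq s}(\alpha, \beta)$, and
$CP_{\geq s}(\alpha, \beta) \wedge \neg CP_{=s}(\alpha, \beta)$ by
$CP_{>s}(\alpha, \beta)$.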
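As a worked example, one of these abbreviations can be unfolded step by step into the primitive operator. The sketch below reads the operators as "at least" ($\geq$), "at most" ($\leq$), "greater than" ($>$) and "equal" ($=$); the closing remark about the intended reading is informal, since the satisfaction relation is only defined later.

```latex
% Unfolding P_{=s}\alpha into the primitive operator P_{\geq s}:
\begin{align*}
  P_{=s}\alpha
    &:= P_{\geq s}\alpha \wedge \neg P_{>s}\alpha \\
    &\;= P_{\geq s}\alpha \wedge \neg\neg P_{\leq s}\alpha
       && \text{since } P_{>s}\alpha := \neg P_{\leq s}\alpha \\
    &\;= P_{\geq s}\alpha \wedge \neg\neg P_{\geq 1-s}\neg\alpha
       && \text{since } P_{\leq s}\alpha := P_{\geq 1-s}\neg\alpha .
\end{align*}
% Intended reading: the probability of \alpha is at least s, and the
% probability of \neg\alpha is at least 1-s, so the probability of \alpha is s.
```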

Let $For_S = For_C \cup For_P$. $\varphi$, $\psi$, ... will be used to denote
formulas from the set $For_S$. For $\alpha \in For_C$ and $A \in For_P$, we
abbreviate both $\neg(\alpha \rightarrow \alpha)$ and $\neg(A \rightarrow A)$
by $\bot$, letting the context determine the meaning.

3 Semantics 

The semantics for $For_S$ will be based on the possible-world approach.
Definition 1. An $LPP^S$-model is a structure $\langle W, H, \mu, v \rangle$ where:

– $W$ is a nonempty set of elements called worlds,

– $H$ is an algebra of subsets of $W$,

– $\mu : H \rightarrow S$ is a finitely additive probability measure, and

– $v : W \times Var \rightarrow \{true, false\}$ is a valuation which associates
  with every world $w \in W$ a truth assignment $v(w)$ on the propositional letters.

The valuation $v$ is extended to a truth assignment on all classical
propositional formulas. Let $M$ be an $LPP^S$-model and $\alpha \in For_C$. The
set $\{w : v(w)(\alpha) = true\}$ is denoted by $[\alpha]_M$.
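The following is a minimal sketch of such a model in the finite case, showing how the set of worlds satisfying a classical formula is computed and then measured. It rests on several assumptions that are not in the paper: the model has three worlds, the algebra is the full power set, the measure is induced by rational world weights (so no infinitesimal values occur), and formulas are encoded as nested tuples. All names are hypothetical.

```python
from fractions import Fraction

# Assumed finite model: three worlds, the algebra is the power set of WORLDS.
WORLDS = ("w1", "w2", "w3")

# Valuation: for each world, a truth assignment on the propositional letters.
VALUATION = {
    "w1": {"p": True, "q": False},
    "w2": {"p": True, "q": True},
    "w3": {"p": False, "q": True},
}

# Rational weights summing to 1; they induce a finitely additive measure.
WEIGHTS = {"w1": Fraction(1, 2), "w2": Fraction(1, 3), "w3": Fraction(1, 6)}


def holds(world, formula):
    """Extend the valuation at `world` to a classical formula.

    Formulas are tuples: ("var", name), ("not", f), ("and", f, g).
    """
    tag = formula[0]
    if tag == "var":
        return VALUATION[world][formula[1]]
    if tag == "not":
        return not holds(world, formula[1])
    if tag == "and":
        return holds(world, formula[1]) and holds(world, formula[2])
    raise ValueError(f"unknown connective: {tag!r}")


def extension(formula):
    """The set of worlds in which the formula is true."""
    return frozenset(w for w in WORLDS if holds(w, formula))


def measure(event):
    """Measure of a set of worlds: the sum of the weights of its members."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))


P = ("var", "p")
Q = ("var", "q")
p_and_q = ("and", P, Q)

print(sorted(extension(p_and_q)))     # ['w2']
print(measure(extension(p_and_q)))    # 1/3
print(measure(extension(P)))          # 5/6
```

Because the algebra here is the full power set, every such set of worlds is trivially measurable; the measurability condition only becomes a real restriction for smaller algebras.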

Definition 2. An $LPP^S$-model $M$ is measurable if $[\alpha]_M$ is measurable
for every formula $\alpha \in For_C$ (i.e., $[\alpha]_M \in H$). An
$LPP^S$-model $M$ is neat if only the empty




